Abstract

We study the problem of obtaining efficient, deterministic, black-box polynomial identity
testing algorithms for depth-3 set-multilinear circuits (over arbitrary fields). This class of circuits
has an efficient, deterministic, white-box polynomial identity testing algorithm (due to Raz and
Shpilka [RS05]), but has no known such black-box algorithm. We recast this problem as a
question of finding a low-dimensional subspace $\mathcal{H}$, spanned by rank-1 tensors, such that any
non-zero tensor in the dual space $\ker(\mathcal{H})$ has high rank. We obtain explicit constructions of essentially
optimal-size hitting sets for tensors of degree 2 (matrices), and obtain quasi-polynomial sized
hitting sets for arbitrary tensors (but this second hitting set is less explicit).

We also show connections to the task of performing low-rank recovery of matrices, which 
is studied in the field of compressed sensing. Low-rank recovery asks (say, over $\mathbb{R}$) to recover
a matrix $M$ from few measurements, under the promise that $M$ has rank $\le r$. In this work,
we restrict our attention to recovering matrices that are exactly rank $\le r$ using deterministic,
non-adaptive, linear measurements that are free from noise. Over $\mathbb{R}$, we provide a set (of size
$4nr$) of such measurements, from which $M$ can be recovered in $O(rn^2 + r^3 n)$ field operations,
and the number of measurements is essentially optimal. Further, the measurements can be 
taken to be all rank-1 matrices, or all sparse matrices. To the best of our knowledge no explicit 
constructions with those properties were known prior to this work. 

We also give a more formal connection between low-rank recovery and the task of sparse 
(vector) recovery: any sparse-recovery algorithm that exactly recovers vectors of length $n$ and
sparsity $2r$, using $m$ non-adaptive measurements, yields a low-rank recovery scheme for exactly
recovering $n \times n$ matrices of rank $\le r$, making $2nm$ non-adaptive measurements. Furthermore,
if the sparse-recovery algorithm runs in time $\tau$, then the low-rank recovery algorithm runs in
time $O(rn^2 + n\tau)$. We obtain this reduction using linear-algebraic techniques, and not using
convex optimization, which is more commonly seen in compressed sensing algorithms. 

Finally, we also make a connection to rank-metric codes, as studied in coding theory. These 
are codes with codewords consisting of matrices (or tensors) where the distance of matrices A 
and B is rank(A - B), as opposed to the usual Hamming metric. We obtain essentially optimal-rate
codes over matrices, and provide an efficient decoding algorithm. We obtain codes over
tensors as well, with poorer rate, but still with efficient decoding. 
1 Introduction



We start with a motivating example. Let $x$ and $y$ be vectors of $n$ variables each. Let $M$ be an
$n \times n$ matrix (over some field, say $\mathbb{R}$), and define the quadratic form

\[ f_M(x, y) := x^T M y . \]

Suppose now that we are given an oracle to $f_M$, that can evaluate $f_M$ on inputs $(x, y)$ that we
supply. The type of question we consider is: how many (deterministically chosen) evaluations of
$f_M$ must we make in order to determine whether $M$ is non-zero?

It is not hard to show that $n^2$ evaluations to $f_M$ are necessary and sufficient to determine whether
$M$ is non-zero. The question becomes more interesting when we are promised that $\mathrm{rank}(M) \le r$.
That is, given that $\mathrm{rank}(M) \le r$, can we (deterministically) determine whether $M = 0$ using $\ll n^2$
evaluations of $f_M$? It is not hard to show that there (non-explicitly) exist $\approx 2nr$ evaluations to
determine whether $M = 0$, and one of the new results in this paper is to give an explicit construction
of $2nr$ such evaluations (over $\mathbb{R}$).
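
As an illustration of the query model, the following minimal Python sketch treats $f_M$ as a black box and recovers $M$ with the $n^2$ standard-basis queries, since $f_M(e_i, e_j) = M_{i,j}$. The function names `make_oracle` and `recover_with_all_queries` are illustrative only and do not come from the paper; integer entries stand in for field elements.

```python
def make_oracle(M):
    """Return black-box access to f_M(x, y) = x^T M y; the caller never sees M."""
    n = len(M)

    def f(x, y):
        return sum(x[i] * M[i][j] * y[j] for i in range(n) for j in range(n))

    return f


def recover_with_all_queries(f, n):
    """Recover M from the n^2 evaluations f(e_i, e_j) on standard basis vectors."""
    def e(i):
        return [1 if k == i else 0 for k in range(n)]

    return [[f(e(i), e(j)) for j in range(n)] for i in range(n)]


M = [[1, 2], [3, 4]]
oracle = make_oracle(M)
recovered = recover_with_all_queries(oracle, 2)
```

This naive scheme ignores any rank promise; the point of the results below is to use far fewer than $n^2$ queries when the rank is small.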

We also consider various generalizations of this problem. The first generalization is to move
from matrices (which are in a sense 2-dimensional) to the more general notion of tensors (which
are in a sense $d$-dimensional). That is, a tensor is a map $T : [n]^d \to \mathbb{F}$ and like a matrix we can
define a polynomial

\[
f_T(x_{1,1}, \ldots, x_{1,n}, \ldots, x_{d,1}, \ldots, x_{d,n}) := \sum_{i_1, \ldots, i_d \in [n]} T(i_1, \ldots, i_d) \prod_{j=1}^{d} x_{j,i_j} .
\]

As with matrices, tensors have a notion of rank (defined later), and we can ask: given that $\mathrm{rank}(T) \le r$,
how many (deterministically chosen) evaluations of $f_T$ are needed to determine whether $T = 0$?
As $T = 0$ iff $f_T = 0$, we see that this problem is an instance of polynomial identity testing, which
asks: given oracle access to a polynomial $f$ that is somehow "simple", how many (deterministically
chosen) queries to $f$ are needed to determine whether $f = 0$?

The above questions ask whether a certain matrix or tensor is zero. However, we can also ask 
for more, and seek to reconstruct this matrix/tensor fully. That is, how many (deterministically 
chosen) evaluations to $f_M$ are needed to determine $M$? This question can be seen to be related to
compressed sensing and sparse recovery, where the goal is to reconstruct a "simple" object from
"few" measurements. In this case, "simple" refers to the matrix being low-rank, as opposed to
a vector being sparse. As above, it is not hard to show that there exist $\approx 4nr$ evaluations that
determine $M$, and this paper gives an explicit construction of $4nr$ such evaluations, as well as an
efficient algorithm to reconstruct M from these evaluations. 

We will now place this work in a broader context by providing background on polynomial 
identity testing, compressed sensing and low-rank recovery, and the theory of rank-metric codes. 

1.1 Polynomial Identity Testing 

Polynomial identity testing (PIT) is the problem of deciding whether a polynomial (specified by an 
arithmetic circuit) computes the identically zero polynomial. The obvious deterministic algorithm 
that completely expands the polynomial unfortunately takes exponential time. This is in contrast 
to the fact that there are several (quite simple) randomized algorithms that solve this problem quite 
efficiently. Further, some of these randomized algorithms treat the polynomial as a black-box, so 
that they only use the arithmetic circuit to evaluate the polynomial on chosen points, as opposed 
to a white-box algorithm which can examine the internal structure of the circuit. Even in the
white-box model, no efficient deterministic algorithms are known for general circuits. 

Understanding the deterministic complexity of PIT has come to be an important problem in 
theoretical computer science. Starting with the work of Kabanets and Impagliazzo [KI04], it has 
been shown that the existence of efficient deterministic (white-box) algorithms for PIT has a tight 
connection with the existence of explicit functions with large circuit complexity. As proving lower 
bounds on circuit complexity is one of the major goals of theoretical computer science, this has led 
to much research into PIT. 

Stronger connections are known when the deterministic algorithms are black-box. Indeed, any
such algorithm corresponds to a hitting set, which is a set of evaluation points such that any small 
arithmetic circuit computing a non-zero polynomial must evaluate to non-zero on at least one point 
in the set. Heintz and Schnorr [HS80], as well as Agrawal [Agr05], showed that any deterministic 
black-box PIT algorithm very easily yields explicit polynomials that have large arithmetic circuit 
complexity. Moreover, Agrawal and Vinay [AV08] showed that a deterministic construction of a 
polynomial size hitting set for arithmetic circuits of depth-4 gives rise to a quasi-polynomial sized 
hitting set for general arithmetic circuits. Thus, the black-box deterministic complexity of PIT 
becomes interesting even for constant-depth circuits. However, currently no polynomial size hitting 
sets are known for general depth-3 circuits. Much of recent work on black-box deterministic PIT 
has identified certain subclasses of circuits for which small hitting sets can be constructed, and this 
work fits into that paradigm. See [SY10] for a survey of recent results on PIT.

One subclass of depth-3 circuits is the model of set-multilinear depth-3 circuits, first introduced
by Nisan and Wigderson [NW96]. Raz and Shpilka [RS05] gave a polynomial-time white-box PIT
algorithm for non-commutative arithmetic formulas, which contains set-multilinear depth-3 circuits
as a subclass. However, no polynomial-time black-box deterministic PIT algorithm is known for
set-multilinear depth-3 circuits. The best known black-box PIT results for the class of set-multilinear
circuits, with top fan-in $\le r$ and degree $d$, are hitting sets of size $\min(n^d, \mathrm{poly}((nd)^r))$, where the
first part of the bound comes from a simple argument (presented in Lemma 3.11), and the second part
of the bound ignores that we have set-multilinear polynomials, and simply uses the best known
hitting sets for so-called $\Sigma\Pi\Sigma(k)$ circuits as established by Saxena and Seshadhri [SS11]. For
non-constant $d$ and $r$, these bounds are super-polynomial. Improving the size of these hitting sets is
the primary motivation for this work.

To connect PIT for set-multilinear depth-3 circuits with the above questions on matrices and
tensors, we now note that any such circuit of top fan-in $\le r$, degree $d$, on $dn$ variables (and thus
size $\le dnr$), computes a polynomial $f_T$, where $T$ is an $[n]^d$ tensor of rank $\le r$. Conversely, any
such $f_T$ can be computed by such a circuit. Thus, constructing better hitting sets for this class of
circuits is exactly the question of finding smaller sets of (deterministically chosen) evaluations to
$f_T$ to determine whether $T = 0$.
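
The correspondence between circuits and tensors can be checked on a small instance. The sketch below is a toy illustration only: it assumes the standard definition of a rank-1 tensor as an outer product $v_1 \otimes \cdots \otimes v_d$ (tensor rank is defined formally later in the paper), uses integers in place of field elements, and the helper names are hypothetical. It builds $T$ as a sum of $r$ rank-1 tensors and compares the brute-force evaluation of $f_T$ with the evaluation of the depth-3 set-multilinear circuit (a sum of $r$ products of $d$ linear forms).

```python
from itertools import product
from math import prod


def tensor_from_rank_one_terms(terms, n, d):
    """Build T = sum over terms of v_1 (x) ... (x) v_d, stored as a dict over [n]^d."""
    return {
        idx: sum(prod(vs[j][idx[j]] for j in range(d)) for vs in terms)
        for idx in product(range(n), repeat=d)
    }


def eval_tensor_poly(T, xs):
    """Evaluate f_T = sum_idx T[idx] * prod_j xs[j][idx[j]] by brute force (n^d terms)."""
    d = len(xs)
    return sum(c * prod(xs[j][idx[j]] for j in range(d)) for idx, c in T.items())


def eval_circuit(terms, xs):
    """Evaluate the circuit: sum over terms of prod_j <v_j, x_j> (about d*n*r operations)."""
    return sum(
        prod(sum(a * b for a, b in zip(vs[j], xs[j])) for j in range(len(xs)))
        for vs in terms
    )


# Two rank-1 terms, so T is a [2]^3 tensor of rank at most 2.
terms = [([1, 2], [0, 1], [1, 1]), ([1, 0], [1, 1], [2, 0])]
T = tensor_from_rank_one_terms(terms, n=2, d=3)
xs = [[3, 1], [2, 5], [1, 4]]
value_from_tensor = eval_tensor_poly(T, xs)
value_from_circuit = eval_circuit(terms, xs)
```

Both evaluations agree by multilinearity; the circuit form is what makes a low-rank $f_T$ cheap to evaluate even though $T$ has $n^d$ entries.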

1.2 Low-Rank Recovery and Compressed Sensing

Low-rank Recovery (LRR) asks (for matrices) to recover an $n \times n$ matrix $M$ from few measurements
of $M$. Here, a measurement is some inner product $\langle M, H \rangle$, where $H$ is an $n \times n$ matrix and the
inner product $\langle \cdot, \cdot \rangle$ is the natural inner product on $n^2$-long vectors. This can be seen as the natural
generalization of the sparse recovery problem, which asks to recover sparse vectors from few linear
measurements. Indeed, over matrices, our notion of sparsity is simply that of being low-rank.

Sparse recovery and compressed sensing are active areas of research, see for example [CSw]. 
Much of this area focuses on constructing distributions of measurements such that the unknown 
sparse vector can be recovered efficiently, with high probability. Also, it is often assumed that the 
sequence of measurements will not depend on any of the measurement results, and this is known
as non-adaptive sparse recovery. We note that Indyk, Price and Woodruff [IPW11] showed that
adaptive sparse recovery can outperform non-adaptive measurements in certain regimes. Much of
the existing work also focuses on efficiency concerns, and various algorithms coming from convex
programming have been used. As such, these algorithms tend to be stable under noise, and can
recover approximations to the sparse vector (and can do so even if the original vector was only
approximately sparse). One of the initial achievements in this field is an efficient algorithm for
recovery of a $k$-sparse$^1$ approximation of an $n$-entry vector in $O(k \log(n/k))$ measurements [CRT05].

Analogous questions for low-rank recovery have also been explored (for example, see [In] and
references therein). Initial work (such as [CT09, CP09]) asked the question of low-rank matrix
completion, where entries of a low-rank matrix $M$ are revealed individually (as opposed to measuring
linear combinations of matrix entries). It was shown in these works that for an $n \times n$ matrix of rank $\le r$,
$O(nr\,\mathrm{polylog}\,n)$ noisy samples suffice for nuclear-norm minimization to complete the matrix
efficiently. Further works (such as [EKP11]) prove that a randomly chosen set of measurements
(with appropriate parameters) gives enough information for low-rank recovery, other works (such
as [CP11, RFP10]) give explicit conditions on the measurements that guarantee that the nuclear
norm minimization algorithm works, and finally other works seek alternative algorithms for certain
ensembles of measurements (such as$^2$ [KOH11]). As in the sparse recovery case, most of these
works seek stable algorithms that can deal with noisy measurements as well as matrices that are
only approximately low-rank. Finally, we note that some applications (such as quantum state
tomography) have additional requirements for their measurements (for example, they should be
easy to prepare as quantum states) and some work has gone into this as well [GLF+10, Gro09].

We now make a crucial observation which shows that black-box PIT for the quadratic form $f_M$
is actually very closely related to low-rank recovery of $M$. That is, note that $f_M(x, y) = x^T M y =
\langle M, x y^T \rangle$. That is, an evaluation of $f_M$ corresponds to a measurement of $M$, and in particular this
measurement is realized as a rank-1 matrix. Thus, we see that any low-rank-recovery algorithm
that only uses rank-1 measurements can also determine if $M$ is non-zero, and thus also performs
PIT for quadratic forms. Conversely, suppose we have a black-box PIT algorithm for rank $\le 2r$
quadratic forms. Note then that for any $M, N$ with rank $\le r$, $M - N$ has rank $\le 2r$. Thus, if
$M \ne N$ then $f_{M-N}$ will evaluate to non-zero on some point in the hitting set. As $f_{M-N} = f_M - f_N$,
it follows that a hitting set for rank $\le 2r$ matrices will distinguish $M$ and $N$. In particular, this
shows that information-theoretically any hitting set for rank $\le 2r$ matrices is also an LRR set.
Thus, in addition to constructing hitting sets for the quadratic forms $f_M$, this paper will also use
those hitting sets as LRR sets, and also give efficient LRR algorithms for these constructions.
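
The two identities used in this observation are easy to verify numerically. The following Python sketch is a toy check under stated assumptions (integer entries in place of field elements, helper names `f`, `outer`, `inner`, `sub` chosen only for illustration): it confirms that an evaluation of the quadratic form equals the inner product with the rank-1 matrix $x y^T$, and that a point on which $f_{M-N}$ is non-zero separates $M$ from $N$.

```python
def f(M, x, y):
    """Evaluate the quadratic form x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def outer(x, y):
    """Rank-1 measurement matrix x y^T."""
    return [[xi * yj for yj in y] for xi in x]


def inner(A, B):
    """Entrywise inner product <A, B> of two matrices viewed as long vectors."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


M = [[1, 2], [3, 4]]
N = [[1, 2], [3, 5]]
x, y = [0, 1], [0, 1]

evaluation = f(M, x, y)                # evaluation of the quadratic form at (x, y)
measurement = inner(M, outer(x, y))    # the same number, as a rank-1 measurement
difference = f(sub(M, N), x, y)        # non-zero, so (x, y) tells M and N apart
```

Here the single point $(x, y)$ happens to hit $M - N$; a hitting set for rank $\le 2r$ matrices guarantees such a point exists for every pair of distinct rank $\le r$ matrices.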

1.3 Rank-Metric Codes 

Most existing work on LRR has focused on random measurements, whereas the interesting aspect
of PIT is to develop deterministic evaluations of polynomials. As the main motivation for this
paper is to develop new PIT algorithms, we will seek deterministic LRR schemes. Further, we will
want results that are field independent, and so this work will focus on noiseless measurements (and
matrices that are exactly of rank $\le r$). In such a setting, LRR constructions are closely related to
rank-metric codes. These codes (related to array codes) are error-correcting codes where the messages
are matrices (or tensors) and the normal notion of distance (the Hamming metric) is replaced by
the rank metric (that is, the distance of matrices $M$ and $N$ is $\mathrm{rank}(M - N)$). Over matrices, these

$^1$A vector is $k$-sparse if it has at most $k$ non-zero entries.
$^2$Interestingly, [KOH11] use what they call subspace expanders, a notion that was studied before in a different
context in theoretical computer science and mathematics under the name of dimension expanders [LZ08, DS08].
codes were originally introduced independently by Gabidulin, Delsarte and Roth [GK72, Gab85b,
Gab85a, Del78, Rot91]. They showed, using ideas from BCH codes, how to get optimal (that
is, meeting an analogue of the Singleton bound) rank-metric codes over matrices, as well as how
to decode these codes efficiently. A later result by Meshulam [Mes95] constructed rank-metric
codes where every codeword is a Hankel matrix. Roth [Rot91] also showed how to construct
rank-metric codes from any Hamming-metric code, but did not provide a decoding algorithm. Later,
Roth [Rot96] considered rank-metric codes over tensors and gave decoding algorithms for a constant
number of errors. Roth also discussed analogues to the Gilbert-Varshamov and Singleton bounds in
this regime. This alternate metric is motivated by crisscross errors in data storage scenarios, where
corruption can occur in bursts along a row or column of a matrix (and are thus rank-1 errors).
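
To see why the rank metric suits crisscross errors, the following Python sketch compares the two distances on a toy example in which a burst wipes out one row of a $3 \times 3$ array. It is only an illustration: rank is computed over the rationals with exact Gaussian elimination (the codes discussed here may live over other fields), and the function names are not taken from the paper.

```python
from fractions import Fraction


def rank(M):
    """Rank of an integer or rational matrix via exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        # Find a pivot in column c among the rows not yet used as pivots.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                factor = rows[i][c] / rows[rk][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def rank_distance(A, B):
    """Distance in the rank metric: rank(A - B)."""
    return rank([[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)])


def hamming_distance(A, B):
    """Distance in the Hamming metric: number of differing entries."""
    return sum(a != b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [row[:] for row in A]
B[1] = [0, 0, 0]  # a burst corrupts the entire middle row

print(rank_distance(A, B), hamming_distance(A, B))  # prints: 1 3
```

A single corrupted row counts as $n$ errors in the Hamming metric but only as one error in the rank metric.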

We now explain how rank-metric codes are related to LRR. Suppose we have a set of matrices $\mathcal{H}$
which form a set of (non-adaptive, deterministically chosen) LRR measurements that can recover
rank $\le r$ matrices. Define the code $C$ as the set of matrices orthogonal to each matrix in $\mathcal{H}$. Thus,
$C$ is a linear code. Further, given some $M \in C$ and $E$ such that $\mathrm{rank}(E) \le r$, it follows that
$\mathcal{H}(M + E) = \mathcal{H}E$ (where we abuse notation and treat $M$ and $E$ as $n^2$-long vectors, and $\mathcal{H}$ as an
$|\mathcal{H}| \times n^2$ matrix). That $\mathcal{H}$ is an LRR set means that $E$ can be recovered from the measurements
$\mathcal{H}E$. Thus the code $C$ can correct $r$ errors (and has minimum distance $\ge 2r + 1$, by a standard
coding theory argument, as encapsulated in Lemma 8.4). Similarly, given a rank-metric code $C$
that can correct up to rank $\le r$ errors, the parity checks of this code define an LRR scheme. Thus,
a small LRR set is equivalent to a rank-metric code with good rate.

The previous subsection showed the tight connection between LRR and PIT. Via the above 
paragraph, we see that hitting sets for quadratic forms are equivalent to rank-metric codes, when 
the parity check constraints are restricted to be rank 1 matrices. 

1.4 Reconstruction of Arithmetic Circuits 

Even more general than the PIT and LRR problems, we can consider the problem of reconstruction 
of general arithmetic circuits only given oracle access to the evaluation of that circuit. This is the 
arithmetic analog of the problem of learning a function using membership queries. For more 
background on reconstruction of arithmetic circuits we refer the reader to [SY10]. Just as with the
PIT and LRR connection, PIT for a specific circuit class gives information-theoretic reconstruction 
for that circuit class. As we consider the PIT question for tensors, we can also consider the 
reconstruction problem. 

The general reconstruction problem for tensors of degree d and rank r was considered before 
in the literature [BBV96, BBB+00, KS06] where learning algorithms were given for any value of 
r. However, those algorithms are inherently randomized. Also of note is that the algorithms of 
[BBB+00, KS06] output a multiplicity automata, which in the context of arithmetic circuits can 
be thought of as an arithmetic branching program. In contrast, the most natural form of the 
reconstruction question would be to output a degree d tensor. 

1.5 Our Results 

In this subsection we informally summarize our results. We again stress that our results handle
matrices of exactly rank $\le r$, and we consider non-adaptive, deterministic measurements. The
culminating result of this work is the connection showing that low-rank recovery reduces to performing
sparse-recovery, and that we can use dual Reed-Solomon codes to instantiate the sparse-recovery
oracle to achieve a low-rank recovery set that only requires rank-1 (or even sparse) measurements.
We find the fact that we can transform an algorithm for a combinatorial property (recovering sparse
signals) to an algorithm for an algebraic property (recovering low-rank matrices) quite interesting.

Hitting Sets for Matrices and Tensors We begin with constructions of hitting sets for matrices,
so as to get black-box PIT for quadratic forms. By improving a construction of rank-preserving
matrices from Gabizon-Raz [GR08], we are able to show the following result, which we can then
leverage to construct hitting sets.

Theorem (Theorem 5.1). Let $n \ge r \ge 1$. Let $\mathbb{F}$ be a "large" field, and let $g \in \mathbb{F}$ have "large"
multiplicative order. Let $M$ be an $n \times n$ matrix of rank $\le r$ over $\mathbb{F}$. Let $f_M(\mathbf{x}, \mathbf{y}) = \mathbf{x}^T M \mathbf{y}$ be the
bivariate polynomial defined by the vectors $\mathbf{x} \in \mathbb{F}[x]^n$ and $\mathbf{y} \in \mathbb{F}[y]^n$ such that$^3$ $(\mathbf{x})_i = x^i$ and $(\mathbf{y})_i = y^i$.

Then $M$ is non-zero iff one of the univariate polynomials $f_M(x, x), f_M(x, gx), \ldots, f_M(x, g^{r-1}x)$
is non-zero.

Intuitively this says that we can test if the quadratic form $f_M$ is zero by testing whether each
of $r$ univariate polynomials is zero. As these univariate polynomials are of degree $< 2n$, it follows
that we can interpolate them fully using $2n$ evaluations. As such a univariate polynomial is zero
iff all of these evaluations are zero, this yields a $2nr$ sized hitting set. While this only works for
"large" fields, we can combine this with results on simulation of large fields (see Section 6.3) to
derive results over any field with some loss. This is encapsulated in the next results for black-box
PIT, where the log factors are unnecessary over large fields.
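
The $2nr$-point hitting set described above can be written out directly. The Python sketch below is a minimal illustration under assumptions not fixed by the text: it works over the rationals (using exact integer arithmetic), takes $g = 2$ as an element of infinite multiplicative order, and uses the interpolation points $\alpha = 0, 1, \ldots, 2n-1$; the function names are hypothetical. Each point is the pair $\mathbf{x} = (1, \alpha, \ldots, \alpha^{n-1})$, $\mathbf{y} = (1, g^\ell\alpha, \ldots, (g^\ell\alpha)^{n-1})$ for $0 \le \ell < r$.

```python
def hitting_set(n, r, g=2):
    """Return the 2nr evaluation points (x, y) for n x n matrices of rank <= r."""
    points = []
    for l in range(r):
        for alpha in range(2 * n):  # 2n distinct points determine a degree < 2n polynomial
            x = [alpha ** i for i in range(n)]
            y = [(g ** l * alpha) ** j for j in range(n)]
            points.append((x, y))
    return points


def evaluate(M, x, y):
    """Evaluate the quadratic form x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_hit(M, r, g=2):
    """True if some point of the rank-r hitting set gives a non-zero evaluation."""
    n = len(M)
    return any(evaluate(M, x, y) != 0 for x, y in hitting_set(n, r, g))


# A rank-2 matrix whose form vanishes identically on the l = 0 points (y = x).
S = [[0, 1], [-1, 0]]
missed_with_r1 = not is_hit(S, r=1)   # the rank-1 set is not required to hit S
hit_with_r2 = is_hit(S, r=2)          # the shifted polynomial f_S(x, g x) catches it
```

The antisymmetric example shows why a single polynomial $f_M(x, x)$ does not suffice once the rank exceeds 1, and why the shifts by powers of $g$ are needed.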

Theorem (Corollaries 6.13 and 6.17). Let $n \ge r \ge 1$. Let $\mathbb{F}$ be any field, then there is a
poly(n)-explicit$^4$ hitting set for $n \times n$ matrices of rank $\le r$, of size $O(nr \lg^2 n)$.

Theorem (Corollary 6.18). Let $n, r \ge 1$ and $d \ge 2$. Let $\mathbb{F}$ be any field, then there is a
$\mathrm{poly}((nd)^d, r^{\lg d})$-explicit hitting set for $[n]^d$ tensors of rank $\le r$, of size $O(dnr^{\lg d} \cdot (d \lg(nd))^d)$.

If $\mathbb{F}$ is large enough then the $O((d \lg(nd))^d)$ term is unnecessary. In such a situation, this
is a quasi-polynomial sized hitting set, improving on the $\min(n^d, \mathrm{poly}((nd)^r))$ sized hitting set
achievable by invoking the best known results for $\Sigma\Pi\Sigma(k)$ circuits [SS11]. However, this hitting
set is not as explicit as the construction of [SS11] since it takes at least $n^d$ time to compute, as
opposed to $\mathrm{poly}(n, d, r)$. Nevertheless, although it takes $\mathrm{poly}((nd)^d, r^{\lg d})$ time to construct the set,
the fact that it is of quasi-polynomial size is quite interesting and novel. Indeed, in general it is
not clear at all how to construct a quasi-polynomial sized hitting set for general circuits (or just for
depth-3 circuits), when one is allowed even an $\exp(nd)$ construction time (where $n$ is the number
of variables, and $d$ is the degree of the output polynomial). We note that this result improves on
the two obvious hitting sets seen in Lemmas 3.11 and 3.13. The first gives $n^d$ tensors in the hitting
set and is $\mathrm{polylog}(n, d, r)$-explicit while the second gives a set of size $\approx dnr$ while not being explicit
at all. The above result non-trivially interpolates between these two results. Finally, we mention 
that in Remark 6.9 we explain how one can achieve (roughly) a poly(r((in)^)-constructible hitting 
set of the same size. As this is a somewhat mild improvement (this is still not the explicitness that 
we were looking for) we only briefly sketch the argument. 

Low-Rank Recovery As mentioned in the previous section, black-box PIT results imply LRR 
constructions in an information theoretic sense. Thus, the above hitting sets imply LRR constructions
but the algorithm for recovery is not implied by the above result. To yield algorithmic

$^3$In this paper, vectors and matrices are indexed from zero, so $\mathbf{x} = (1, x, \ldots, x^{n-1})^T$.

$^4$An $n \times n$ matrix is $t$-explicit if each entry can be (deterministically) computed in $t$ steps, where field operations
are considered unit cost.
results, we actually establish a stronger claim. That is, we first show that the above hitting sets
embed a natural sparse-recovery set arising from the dual Reed-Solomon code. Then we develop 
an algorithm that shows that any sparse-recovery set gives rise to a low-rank-recovery set, and 
that recovery can be performed efficiently given an oracle for sparse recovery. This connection (in 
the context that any error-correcting code in the Hamming metric yields an error-correcting code
in the rank-metric) was independently made by Roth [Rot91] (see Theorem 3), who did not give a 
recovery procedure for the resulting LRR scheme. The next theorem, which is the main result of 
the paper, shows this connection is also efficient with respect to recovery. 

Theorem (Theorem 7.19). Let $n \ge r \ge 1$. Let $\mathcal{V}$ be a set of (non-adaptive) measurements for
$2r$-sparse-recovery for $n$-long vectors. Then there is a poly(n)-explicit set $\mathcal{H}$, which is a
(non-adaptive) rank $\le r$ low-rank-recovery set for $n \times n$ matrices, with a recovery algorithm running in
time $O(rn^2 + n\tau)$, where $\tau$ is the amount of time needed to do sparse-recovery from $\mathcal{V}$. Further,
$|\mathcal{H}| = 2n|\mathcal{V}|$, and each matrix in $\mathcal{H}$ is $n$-sparse.

This result shows that sparse-recovery and low-rank recovery (at least in the exact case) are 
very closely connected. Interestingly, this shows that sparse-recovery (which can be regarded as a 
combinatorial property) and low-rank recovery (which can be regarded as an algebraic property) 
are tightly connected. Many fruitful connections have taken this form, such as in spectral graph 
theory, and perhaps the connection presented here will yield yet further results. 
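
For readers less familiar with exact sparse recovery, the following Python sketch checks the sparse-recovery property by brute force on a tiny instance. It is a stand-in, not the paper's construction: it assumes Vandermonde-style measurements (power sums at the distinct nodes $1, \ldots, n$, in the spirit of Reed-Solomon parity checks), works over the integers, and only verifies that distinct $k$-sparse vectors with entries from a small set receive distinct measurement vectors; the paper's exact parameterization and its recovery algorithm may differ.

```python
from itertools import combinations, product


def vandermonde_measurements(v, m):
    """m linear measurements of v: sum_i v[i] * a_i^t for t = 0..m-1, with nodes a_i = i + 1."""
    return tuple(sum(val * (i + 1) ** t for i, val in enumerate(v)) for t in range(m))


def k_sparse_vectors(n, k, values):
    """All length-n vectors with at most k non-zero entries drawn from `values`."""
    vectors = set()
    for support in combinations(range(n), k):
        for entries in product([0] + list(values), repeat=k):
            v = [0] * n
            for pos, val in zip(support, entries):
                v[pos] = val
            vectors.add(tuple(v))
    return vectors


def measurements_are_injective(n, k, m, values):
    """True if the m measurements distinguish all the enumerated k-sparse vectors."""
    vectors = k_sparse_vectors(n, k, values)
    images = {vandermonde_measurements(v, m) for v in vectors}
    return len(images) == len(vectors)


enough = measurements_are_injective(n=5, k=2, m=4, values=[-1, 1, 2])
too_few = measurements_are_injective(n=5, k=2, m=1, values=[-1, 1, 2])
```

With $m = 2k$ measurements the difference of two $k$-sparse vectors is $2k$-sparse and cannot lie in the kernel of a Vandermonde matrix with distinct nodes, which is why the first check succeeds while a single measurement cannot separate the vectors.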

Also, the algorithm used in the above result is purely linear-algebraic, in contrast to the convex 
optimization approaches that many compressed sensing works use. However, we do not know if 
the above result is stable to noise, and regard this issue as an important question left open by this 
work. 

When the above result is combined with our hitting set results, we achieve the following LRR 
scheme for matrices (and an LRR scheme for tensors, with parameters similar to Corollary 6.18 
mentioned above, and Corollary 8.6 mentioned below, is derived in Corollary 8.2). 

Theorem (Corollary 7.26). Let $n \ge r \ge 1$. Over any field $\mathbb{F}$, there is a poly(n)-explicit set $\mathcal{H}$, of
$O(rn \lg^2 n)$ size, such that measurements against $\mathcal{H}$ allow recovery of $n \times n$ matrices of rank $\le r$
in time poly(n). Further, the matrices in $\mathcal{H}$ can be chosen to be all rank 1, or all $n$-sparse.

We note again that over large fields these logarithmic factors are seen to be unneeded. 

Some prior work [GK72, Gab85b, Gab85a, Del78, Rot91] on LRR focused on finite fields, and 
as such based their results on BCH codes. The above result is based on (dual) Reed-Solomon codes, 
and as such works over any field (when combined with results allowing simulation of large fields by 
small fields). Other prior work [RFP10] on exact LRR permitted randomized measurements, while
we achieve deterministic measurements. 

Further, we are able to do LRR with measurements that are either all $n$-sparse, or all rank-1.
As Roth [Rot91] independently observed, the $n$-sparse LRR measurements can arise from any
(Hamming-metric) error-correcting code (but he did not provide decoding). Tan, Balzano and
Draper [TBD11] showed that random $(n \lg n)$-sparse measurements provide essentially the same
low-rank recovery properties as random measurements. Thus, our results essentially achieve this
deterministically.

We further observe that a specific code (the dual Reed-Solomon code) allows a change of basis 
for the measurements, and in this new basis the measurements are all rank 1. Recht et al. [RFP10]
asked whether low-rank recovery was possible when the measurements were rank 1 (or "factored"), 
as such measurements could be more practical as they are simpler to generate and store in memory. 
Thus, our construction answers this question in the positive direction, at least for exact LRR. 
Rank-Metric Codes Appealing to the connection between LRR and rank-metric codes, we
achieve the following constructions of rank-metric codes.

Theorem (Corollary 8.5). Let $\mathbb{F}$ be any field, $n \ge 1$ and $1 \le r \le n/2$. Then there are
poly(n)-explicit rank-metric codes with poly(n)-time decoding for up to $r$ errors, with parameters
$[[n]^2, (n - 2r)^2 \cdot O(\lg^2 n), 2r + 1]_{\mathbb{F}}$, and the parity checks on this code can be chosen to be all rank-1 matrices,
or all $n$-sparse matrices.

Earlier work on rank-metric codes over finite fields [GK72, Gab85b, Gab85a, Del78, Rot91]
achieved $[[n]^2, n(n - 2r), 2r + 1]_{\mathbb{F}_q}$ rank-metric codes, with efficient decoding algorithms. These
are optimal (meeting the analogue of the Singleton bound for rank-metric codes). However, these
constructions only work over finite fields. While our code achieves a worse rate, its construction
works over any field, and over infinite fields the $O(\lg^2 n)$ term is unneeded. Further, Roth [Rot91]
observed that the resulting $[[n]^2, (n - 2r)^2, 2r + 1]$ code is optimal (see discussion of his Theorem
3) over algebraically closed fields (which are infinite).

We are also able to give rank-metric codes over tensors, which can correct errors up to rank
$\approx n^{d/\lg d}$ (out of a maximum $n^{d-1}$), while still achieving constant rate. The rank-metric code
arising from the naive low-rank recovery of Lemma 3.11 never achieves constant rate, and prior
work by Roth [Rot96] only gave decoding against a constant number of errors.

Theorem (Corollary 8.6). Let $\mathbb{F}$ be any field, $n, r \ge 1$ and $d \ge 2$. Then there are
$\mathrm{poly}((nd)^d, r^{\lg d})$-explicit rank-metric codes with $\mathrm{poly}((nd)^d, r^{\lg d})$-time decoding for up to $r$ errors,
with parameters $[[n]^d, n^d - O(d^2 n r^{\lg d} \lg(dn)), 2r + 1]_{\mathbb{F}}$.

We note here that our decoding algorithm will return the entire tensor, which is of size $n^d$.
Trivially, any algorithm returning the entire tensor must take at least $n^d$ time. In this case, the
level of explicitness of the code we achieve is reasonable. However, a more desirable result would
be for the algorithm to return a rank $\le r$ representation of the tensor, and thus the $n^d$ lower
bound would not apply so that one could hope for faster decoding algorithms. Unfortunately, even
for $d = 3$ an efficient algorithm to do so would imply P = NP. That is, if an algorithm (even
one which is not a rank-metric decoding or low-rank recovery algorithm) could produce a rank $\le r$
decomposition for any rank $\le r$ tensor, then one could compute tensor-rank, as it is the minimum
$r$ such that the resulting rank $\le r$ decomposition actually computes the desired tensor (this can
be checked in $\mathrm{poly}(n^d)$ time). However, Håstad [Hås90] showed that tensor-rank (over finite fields)
is NP-hard for any fixed $d \ge 3$. It follows that for any (fixed) $d \ge 3$, if one could recover (even in
$\mathrm{poly}(n^d)$-time) a rank $\le r$ tensor into its rank $\le r$ decomposition, then P = NP. Thus, we only
discuss recovery of a tensor by reproducing its entire list of entries, as opposed to its more concise
representation.

Finally, we remark that in [Rot96] Roth discussed the question of decoding rank-metric codes of 
degree d = 3, gave decoding algorithms for errors of rank 1 and 2, and wrote that "Since computing 
tensor rank is an intractable problem, it is unlikely that we will have an efficient decoding algorithm 
. . . otherwise, we could use the decoder to compute the rank of any tensor. Hence, if there is any 
efficient decoding algorithm, then we expect such an algorithm to recover the error tensor without 
necessarily obtaining its rank. Such an algorithm, that can handle any prescribed number of errors, 
is not yet known." Thus, our work gives the first such algorithm for tensors of degree d > 2. 

1.6 Proof Overview 

In this section we give proof outlines of the results mentioned so far. 
Hitting Sets for Matrices The main idea for our hitting set construction is to reduce the
question of hitting (non-zero) $n \times n$ matrices to a question of hitting (non-zero) $r \times r$ matrices.
Once this reduction is performed, we can then run the naive hitting set of Lemma 3.11, which
queries all entries. This can loosely be seen in analogy with the kernelization process in
fixed-parameter tractability, where a problem depending on the input size, $n$, and some parameter, $k$,
can be solved by first reducing to an instance of size $f(k)$, and then brute-forcing this instance.

To perform this kernelization, we first note that any $n \times n$ matrix $M$ of rank exactly $r$ can be
written as $M = PQ^T$ where $P$ and $Q$ are $n \times r$ matrices of rank exactly $r$. To reduce $M$ to an
$r \times r$ matrix, it thus suffices to reduce $P$ and $Q$ each to $r \times r$ matrices, denoted $P'$ and $Q'$. As
this reduction must preserve the fact that $M$ is non-zero, we need that $P'Q'^T \ne 0$. We enforce this
requirement by insisting that $P'$ and $Q'$ are also rank exactly $r$, so that $M' = P'Q'^T$ is also non-zero.

To achieve this rank-preservation, we turn to a lemma of Gabizon-Raz [GR08] (we note that
this lemma has been used before for black-box PIT [KS08, SS11]). They gave an explicit family
of $O(nr^2)$-many $r \times n$ matrices $\{A_\ell\}_\ell$, such that for any $P$ and $Q$ of rank exactly $r$, at least one
matrix $A_\ell$ from the family is such that $\mathrm{rank}(A_\ell P) = \mathrm{rank}(A_\ell Q) = r$. Translating this result into
our problem, it follows that one of the $r \times r$ matrices $A_\ell M A_\ell^T$ is full-rank. The $(i,j)$-th entry of
$A_\ell M A_\ell^T$ is $\langle M, (A_\ell)_i^T (A_\ell)_j \rangle$, where $(A_\ell)_i$ is the $i$-th row of $A_\ell$. It follows that querying each
entry in these $r \times r$ matrices corresponds to a rank 1 measurement of $M$, and thus these queries make up
a hitting set. As there were $O(nr^2)$ choices of $\ell$ and $r^2$ choices of $(i,j)$, this gives an $O(nr^4)$-sized
hitting set.
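
The correspondence between entries of $A M A^T$ and rank 1 measurements can be checked on a toy example. The following is a minimal Python sketch; the matrices and the helper names (`matmul`, `outer`, `inner`) are illustrative assumptions and do not come from the paper, and `A` here is an arbitrary $r \times n$ matrix rather than a member of the Gabizon-Raz family.

```python
# Toy check: (A M A^T)[i][j] == <M, A_i^T A_j>, i.e. a rank-1 measurement of M.
# All values below are arbitrary illustrative choices.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def outer(u, v):
    """Rank-1 matrix u^T v for row vectors u and v."""
    return [[a * b for b in v] for a in u]

def inner(X, Y):
    """Entrywise (Frobenius) inner product of two equally sized matrices."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

M = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]   # n x n with n = 3
A = [[1, 1, 1], [1, 2, 4]]              # r x n with r = 2

AMAt = matmul(matmul(A, M), transpose(A))
measurements = [[inner(M, outer(A[i], A[j])) for j in range(2)] for i in range(2)]
```

Each of the $r^2$ entries of `measurements` uses only an inner product of `M` with a rank 1 matrix, which is what makes the resulting hitting set proper.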

To achieve a smaller hitting set, we use the following sequence of ideas. First, we observe that
in the above, we can always assume $i = 0$. Loosely, this is because $A_\ell M A_\ell^T$ is always full-rank, or
zero. Thus, only the first row of $A_\ell M A_\ell^T$ needs to be queried to determine this. Second, we improve
upon the Gabizon-Raz lemma, and provide an explicit family of rank-preserving matrices with size
$O(nr)$. This follows from modifying their construction so the degree of a certain determinant is
smaller. To ensure that the determinant is a non-zero polynomial, we show that it has a unique
monomial that achieves maximal degree, and that the term achieving maximal degree has a non-zero
coefficient, as a Vandermonde determinant (formed from powers of an element $g$, which has
large multiplicative order) is non-zero. Finally, we observe that the hitting set constraints can
be viewed as constraints regarding polynomial interpolation. This view shows that some of the
constraints are linearly dependent, and thus can be removed. Each of the above observations saves
a factor of $r$ in the size of the hitting set, and thus produces an $O(nr)$-sized hitting set.

Low-Rank Recovery. Having constructed hitting sets, Lemma 3.10 implies that the same construction
yields low-rank-recovery sets. As this lemma does not provide a recovery algorithm, we
provide one. To do so, we must first change the basis of our hitting set. That is, the hitting set
$\mathcal{B}$ yields a set of constraints on a matrix $M$, and we are free to choose another basis for these
constraints, which we call $\mathcal{D}$. The virtue of this new basis is that each constraint is non-zero only
on some $k$-diagonal (the entries $(i,j)$ such that $i + j = k$). It turns out that these constraints
are the parity checks of a dual Reed-Solomon code with distance $\Theta(r)$. This code can be decoded
efficiently using what is known as Prony's method [dP95], which was developed in 1795. We give an
exposition in Section 7.1, where we show how to syndrome-decode this code up to half its minimum
distance, counting erasures as half-errors. Thus, given a $\Theta(r)$-sparse vector (which can be thought
of as errors from the vector 0) these parity checks impose constraints from which the sparse vector
can be recovered. Put another way, our low-rank-recovery set naturally embeds a sparse-recovery
set along each $k$-diagonal.

Thus, in designing a recovery algorithm for our low-rank recovery set, we do more and show how
to recover from any set of measurements which embed a sparse-recovery set along each $k$-diagonal.
In terms of error-correcting codes, this shows that any Hamming-metric code yields a rank-metric
code over matrices, and that decoding the rank-metric code efficiently reduces to decoding the
Hamming-metric code.

To perform recovery, we introduce the notion of a matrix being in $(<k)$-upper-echelon form.
Loosely, this says that $M^{(<k)}$, the entries $(i,j)$ of the matrix with $i + j < k$, are in row-reduced
echelon form. We then show that for any matrix $M$ in $(<k)$-upper-echelon form, the $k$-diagonal
is $2\,\mathrm{rank}(M)$-sparse. As an example, suppose $M^{(<k)}$ was entirely zero. It follows then that $M$ is
in $(<k)$-upper-echelon form. Further, the rows that have non-zero entries on the $k$-diagonal of $M$
are then linearly independent, as they form a triangular system. It follows that the $k$-diagonal can
only have $\mathrm{rank}(M)$ non-zero entries. The more general case is slightly more complicated technically,
but not conceptually. Thus, this echelon form translates the notion of low rank into the notion of
sparsity.

The algorithm then follows naturally. We induct on $k$, first putting $M^{(<k)}$ into $(<k)$-upper-echelon
form (using row-reduction), and then invoking a sparse-recovery oracle on the $k$-diagonal
of $M$ to recover it. This then yields $M^{(\le k)}$, and we increment $k$. However, as described so far,
the use of the sparse-recovery oracle is adaptive. We show that the row-reduction procedure can
be understood such that the adaptive use of the sparse-recovery oracle can be simulated using
non-adaptive calls to the oracle. More specifically, we will apply the measurements of the
sparse-recovery oracle on each $k$-diagonal of $M$ (which may not be sparse), and show how to compute the
measurements of the adaptive algorithm (where the $k$-diagonals are sparse) from the measurements
made. Putting these steps together, this shows that exact non-adaptive low-rank-recovery reduces
to exact non-adaptive sparse-recovery. Instantiating this claim with our hitting sets from above
gives a concrete low-rank-recovery set, with an accompanying recovery algorithm.

Hitting Sets and Low-Rank Recovery for Tensors. The results for matrices naturally generalize
to tensors in the sense that an $[\![n]\!]^d$ tensor can be viewed as an $[\![n^{d/2}]\!]^2$ matrix. However,
we can do better. Specifically, the hitting set results were done via variable reduction, as
encapsulated by Theorem 5.1, which shows that a rank $\le r$ bivariate polynomial $\hat{f}_M(x,y) =
(1, x, x^2, \ldots, x^{n-1}) M (1, y, y^2, \ldots, y^{n-1})^T$ is zero iff a set of $r$ univariate polynomials are all zero.
Further, the degrees of these polynomials are only twice the original degree. As each univariate
polynomial can be interpolated using $O(n)$ measurements, this yields $O(nr)$ measurements total.
This motivates the more general idea of treating a degree $d$ tensor as a $d$-variate polynomial, and
showing that we can test whether this polynomial is zero by testing if a collection of $d'$-variate
polynomials are zero, for $d' < d$. Recursing on this procedure then reduces the $d$-variate case to
the univariate case, and the univariate case is brute-force interpolated.

The recursion scheme we develop for this is to show that a $d$-variate polynomial is zero iff $r$
$d/2$-variate polynomials are zero, and this naturally leads to an $O(dnr^{\lg d})$-sized hitting set. To
prove its correctness, we show that the bivariate case (corresponding to matrices) applied to two
groups of variables allows us to reduce to a single group of variables (with an increase in the number
of polynomials to test). Finally, since we saw how to do low-rank recovery for matrices, and the
tensor case essentially only uses the matrix case, we can also turn this hitting set procedure into a
low-rank recovery algorithm.

Simulation of Large Fields by Small Fields. Almost all of the results mentioned require a field
of size $\approx \mathrm{poly}(n^d)$. When getting results over small fields, we show that, with some loss, we can
simulate such large fields inside the hitting sets. We break up each tensor $H$ in the original hitting
set into new tensors $H_i$ such that for any $\mathbb{F}$-tensor $T$, $\langle T, H \rangle$ can be reconstructed from the set of
values $\{\langle T, H_i \rangle\}_i$. To do so, we use the well-known representation of an extension field $\mathbb{K}$ of $\mathbb{F}$ as a
field of matrices over $\mathbb{F}$. As the entries of a rank-1 tensor are multiplications of $d$ elements of $\mathbb{K}$, we
can expand these multiplications out as iterated matrix multiplications, which yields $(\dim_{\mathbb{F}} \mathbb{K})^{d+1}$
terms to consider, each of which corresponds to some $H_i$.

Rank-Metric Codes. The above techniques give the existence of low-rank-recovery sets (and
corresponding algorithms) for tensors, over any field. Via the connections presented in Section 1.3,
this readily yields rank-metric codes with corresponding parameters.

2 Notation 

We now fix some notation. For a positive integer $n$ we denote $[n] := \{1, \ldots, n\}$ and
$[\![n]\!] := \{0, \ldots, n-1\}$. We use $\binom{S}{k}$ to denote the set of all subsets of $S$ of size $k$. Given a set $S$ of
integers, we denote $n - S := \{n - s : s \in S\}$. All logarithms will be base 2. Given a polynomial
$f \in \mathbb{F}[x_1, \ldots, x_m]$, $\deg(f)$ will denote the total degree of $f$, and $\deg_{x_i}(f)$ will denote the individual
degree of $f$ in the variable $x_i$. That is, the polynomial $xy$ has total degree 2 and individual degree 1 in
the variable $x$ and individual degree 0 in the variable $z$. Given a monomial $x^{\alpha}$, $C_{x^{\alpha}}(f)$ will denote the
coefficient of $x^{\alpha}$ in the polynomial $f$.

Vectors, matrices, and tensors will all begin indexing from 0, instead of from 1. The number
$n$ will typically refer to the number of rows of a matrix, and $m$ the number of columns. $I_n$ will
denote the $n \times n$ identity matrix. Denote $E_{i,j}$ to be the $n \times n$ square matrix with its $(i,j)$-th entry
being 1, and all other entries being zero. A vector is $k$-sparse if it has at most $k$ non-zero entries.
Given a matrix $A$, $A^T$ will denote its transpose. Given a vector $x \in \mathbb{F}^n$, $|x| := n$.

A list of $n$ values in $\mathbb{F}$ is $t$-explicit if each entry can be computed in $t$ steps, where we allow
operations in $\mathbb{F}$ to be done at unit cost.

Frequently throughout this paper we will divide a matrix into its diagonals, which we define 
as the entries (i, j) where i+ j is constant. The following notation will make this discussion more 
convenient. 

Notation 2.1. Let $M$ be an $n \times m$ matrix. The $k$-diagonal of $M$ is the set of entries $\{M_{i,j}\}_{i+j=k}$.
The $(<k)$-diagonals of $M$ is the set of entries $\{M_{i,j}\}_{i+j<k}$. The $(\le k)$-diagonals of $M$ is the
set of entries $\{M_{i,j}\}_{i+j \le k}$.

$M^{(k)}$, $M^{(<k)}$ and $M^{(\le k)}$ will denote the $k$-diagonal, $(<k)$-diagonals and $(\le k)$-diagonals of $M$,
respectively.
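
As an illustration of Notation 2.1, here is a small Python sketch that extracts these sets of entries from a matrix stored as a list of rows. The function names `k_diagonal` and `diagonals_up_to` are hypothetical helpers introduced only for this example.

```python
# Sketch of Notation 2.1 with 0-based indexing, as used throughout the paper.
# Entries are returned as a dict mapping positions (i, j) to values.

def k_diagonal(M, k):
    """Entries M[i][j] with i + j == k."""
    n, m = len(M), len(M[0])
    return {(i, k - i): M[i][k - i] for i in range(n) if 0 <= k - i < m}

def diagonals_up_to(M, k, inclusive=False):
    """The (<k)-diagonals of M, or the (<=k)-diagonals when inclusive=True."""
    bound = k + 1 if inclusive else k
    entries = {}
    for d in range(bound):
        entries.update(k_diagonal(M, d))
    return entries

M = [[1, 2, 3],
     [4, 5, 6]]

diag_2 = k_diagonal(M, 2)                        # positions (0, 2) and (1, 1)
below_1 = diagonals_up_to(M, 1)                  # only position (0, 0)
up_to_1 = diagonals_up_to(M, 1, inclusive=True)  # adds positions (0, 1), (1, 0)
```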

This notation will be frequently abused, in that a diagonal will refer to a set of positions in a 
matrix in addition to referring to the values in those positions. However, the main diagonal of a 
matrix will refer to the entries of that matrix. 

3 Preliminaries 

In this section we formally define tensors as well as the PIT and LRR problems. We first discuss
tensors, and their notion of rank. Rank-metric codes will be defined and discussed in Section 8.
Recall that we index starting at 0, so we will use the product space $[\![n]\!]^d$ instead of $[n]^d$ for the
domains of tensors.

Definition 3.1. A tensor over a field $\mathbb{F}$ is a function $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$. It is said to have degree
$d$ and size $(n_1, \ldots, n_d)$. If all of the $n_j$ are equal to $n$, then $T$ is said to have size $[\![n]\!]^d$.
Given two tensors $T_1, T_2$ of size $\prod_{j=1}^{d} [\![n_j]\!]$, their inner product is
$\langle T_1, T_2 \rangle := \sum_{i_j \in [\![n_j]\!]} T_1(i_1, \ldots, i_d)\, T_2(i_1, \ldots, i_d)$.

Note that the above inner product is the natural inner product when regarding a $\prod_{j=1}^{d} [\![n_j]\!]$
tensor as a vector of dimension $\prod_{j=1}^{d} n_j$. We now define the notion of rank. Loosely, a tensor is
rank 1 if it can be "factored" along each dimension, and a tensor is rank $\le r$ if it can be expressed
as the sum of $\le r$ rank 1 tensors.

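Definition 3.2. A tensor $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ is rank-one if for $j \in [d]$ there are vectors
$v_j \in \mathbb{F}^{n_j} \setminus \{0\}$ such that $T = \otimes_{j=1}^{d} v_j$. That is, for all $i_j \in [\![n_j]\!]$, $T(i_1, \ldots, i_d) = \prod_{j=1}^{d} v_j(i_j)$, where
$v_j(i_j)$ denotes the $i_j$-th coordinate of $v_j$.

The rank of a tensor $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ is defined as the minimum number of terms in a
summation of rank-1 tensors expressing $T$, that is,

$$\mathrm{rank}_{\mathbb{F}}(T) = \min \left\{ r : T = \sum_{\ell=1}^{r} \otimes_{j=1}^{d} v_{j,\ell},\ v_{j,\ell} \in \mathbb{F}^{n_j} \right\} .$$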

As one might hope, when $d = 2$ the above definitions reduce to the definition of a matrix, and
matrix-rank, respectively. Further, the inner product is then the Frobenius inner product. That
is, $\langle M_1, M_2 \rangle = \mathrm{Trace}(M_1 M_2^T)$.
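
For $d = 2$ this identity is easy to confirm numerically. The sketch below, with arbitrarily chosen integer matrices and hypothetical helper names, computes the entrywise inner product and the trace of $M_1 M_2^T$ separately.

```python
# Check <M1, M2> == Trace(M1 * M2^T) on a small example (values are arbitrary).

def inner(X, Y):
    """Entrywise inner product of two equally sized matrices."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

M1 = [[1, 2], [3, 4]]
M2 = [[0, 1], [2, 1]]

lhs = inner(M1, M2)                      # 1*0 + 2*1 + 3*2 + 4*1
rhs = trace(matmul(M1, transpose(M2)))   # trace of [[2, 4], [4, 10]]
```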

We now define the polynomial of a tensor. 

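Definition 3.3. Let $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ be a tensor, and let $\mathbf{x}_1, \ldots, \mathbf{x}_d$ be vectors of variables, so
$\mathbf{x}_j = (x_{j,0}, \ldots, x_{j,n_j-1})$ for all $j \in [d]$. Then define

$$f_T(\mathbf{x}_1, \ldots, \mathbf{x}_d) := \sum_{i_j \in [\![n_j]\!]} T(i_1, \ldots, i_d) \prod_{j=1}^{d} x_{j,i_j} = \langle T, \mathbf{x}_1 \otimes \cdots \otimes \mathbf{x}_d \rangle ,$$

and define the $d$-variate polynomial

$$\hat{f}_T(x_1, \ldots, x_d) := \sum_{i_j \in [\![n_j]\!]} T(i_1, \ldots, i_d) \prod_{j=1}^{d} x_j^{i_j} = f_T(\mathbf{x}_1, \ldots, \mathbf{x}_d) ,$$

where $(\mathbf{x}_j)_i := x_j^i$.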

Note that the second equality in the first equation of the above definition follows from the
definition of the inner product over tensors. As a matrix $M$ is also a tensor, we will also use
this notation when considering the polynomial $f_M(\mathbf{x}, \mathbf{y}) := \mathbf{x}^T M \mathbf{y}$, as the above definition readily
generalizes the notion of a quadratic form. Note that $\hat{f}_T$ allows us to consider any $d$-variate
polynomial to be a tensor, and the rank of such a polynomial will simply be the rank of the
corresponding tensor.

We now show the connection of these polynomials $f_T$ to set-multilinear depth-3 circuits. We
do not seek to define all of the relevant terms in this notion, and instead refer the reader to the
recent survey [SY10], and will simply define the subclass we are interested in.

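Definition 3.4. For $j \in [d]$, let $\mathbf{x}_j = (x_{j,0}, \ldots, x_{j,n-1})$ be vectors of variables. A degree $d$,
set-multilinear, $\Sigma\Pi\Sigma$ circuit with top fan-in $r$, is a polynomial of the following form

$$C(\mathbf{x}_1, \ldots, \mathbf{x}_d) = \sum_{\ell=1}^{r} \prod_{j=1}^{d} \langle v_{j,\ell}, \mathbf{x}_j \rangle$$

where each $v_{j,\ell} \in \mathbb{F}^n$.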
We now see the following connection between these circuits and tensors.

Lemma 3.5. The polynomials computed by degree $d$ set-multilinear $\Sigma\Pi\Sigma$ circuits, with top fan-in
$\le r$, on $dn$ variables, are exactly the polynomials $f_T$, for tensors $T : [\![n]\!]^d \to \mathbb{F}$ with rank $\le r$.

Proof. ($\Leftarrow$): Suppose $T$ is of rank $\le r$, so $T = \sum_{\ell=1}^{r} \otimes_{j=1}^{d} v_{j,\ell}$ for $v_{j,\ell} \in \mathbb{F}^n$. Then
$f_T = \langle T, \mathbf{x}_1 \otimes \cdots \otimes \mathbf{x}_d \rangle = \sum_{\ell=1}^{r} \langle \otimes_{j=1}^{d} v_{j,\ell}, \mathbf{x}_1 \otimes \cdots \otimes \mathbf{x}_d \rangle = \sum_{\ell=1}^{r} \prod_{j=1}^{d} \langle v_{j,\ell}, \mathbf{x}_j \rangle$,
and this final polynomial is computed by a set-multilinear $\Sigma\Pi\Sigma$ circuit.

($\Rightarrow$): This argument is simply the reverse of the above. □
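
The chain of equalities in this proof can be exercised on a small tensor. The following Python sketch stores a tensor as a dictionary from index tuples to values; the representation, the helper names, and the numeric values are assumptions made for illustration only.

```python
# Sketch: f_T evaluated as <T, x_1 (x) ... (x) x_d> agrees with the circuit
# sum_l prod_j <v_{j,l}, x_j> when T = sum_l (x)_j v_{j,l}.
from itertools import product
from math import prod

def rank_one(vectors):
    """Rank-1 tensor v_1 (x) ... (x) v_d as a dict from index tuples to values."""
    ranges = [range(len(v)) for v in vectors]
    return {idx: prod(v[i] for v, i in zip(vectors, idx)) for idx in product(*ranges)}

def add(T1, T2):
    return {idx: T1[idx] + T2[idx] for idx in T1}

def inner(T1, T2):
    return sum(T1[idx] * T2[idx] for idx in T1)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f_T(T, xs):
    """Evaluate f_T at the vectors xs via the inner product with a rank-1 tensor."""
    return inner(T, rank_one(xs))

# A tensor of degree d = 3 and size n = 2 written as a sum of two rank-1 terms.
terms = [[[1, 2], [0, 1], [3, 1]],
         [[2, 0], [1, 1], [1, 4]]]
T = add(rank_one(terms[0]), rank_one(terms[1]))

xs = [[1, -1], [2, 3], [0, 5]]
tensor_value = f_T(T, xs)
circuit_value = sum(prod(dot(v, x) for v, x in zip(term, xs)) for term in terms)
```

Here the first term contributes $(-1)(3)(5) = -15$ and the second $(2)(5)(20) = 200$, so both evaluations give 185.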

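We also get the following result for the polynomial $\hat{f}_T$.

Lemma 3.6. For $T : [\![n]\!]^d \to \mathbb{F}$ with rank $\le r$, $\hat{f}_T(x_1, \ldots, x_d) = \sum_{\ell=1}^{r} \prod_{j=1}^{d} p_{j,\ell}(x_j)$, where
$\deg p_{j,\ell} < n$.

Proof. As $T$ is rank $\le r$, $T = \sum_{\ell=1}^{r} \otimes_{j=1}^{d} v_{j,\ell}$ for $v_{j,\ell} \in \mathbb{F}^n$. Then $\hat{f}_T = f_T(\mathbf{x}_1, \ldots, \mathbf{x}_d) =
\sum_{\ell=1}^{r} \prod_{j=1}^{d} \langle v_{j,\ell}, \mathbf{x}_j \rangle$, where $(\mathbf{x}_j)_i = x_j^i$. Taking $p_{j,\ell}(x_j) := \langle v_{j,\ell}, \mathbf{x}_j \rangle$ yields the result. □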

Recall that, as discussed in the introduction, set-multilinear $\Sigma\Pi\Sigma$ circuits have a white-box
polynomial-time PIT algorithm due to Raz and Shpilka [RS05] but no known polynomial-sized
black-box PIT algorithm. By the above connection, this is the same as creating hitting sets for
tensors, which we will now define.

Definition 3.7. Let $\mathbb{K}$ be an extension of $\mathbb{F}$. A hitting set $\mathcal{H}$ for $\prod_{j=1}^{d} [\![n_j]\!]$ tensors of rank
$\le r$ over $\mathbb{F}$ is a set of points $\mathcal{H} \subseteq \prod_{j=1}^{d} \mathbb{K}^{n_j}$ such that for any $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ of rank $\le r$, $T$
is non-zero iff there exists $(a_1, \ldots, a_d) \in \mathcal{H}$ such that $f_T(a_1, \ldots, a_d) \ne 0$.

However, we saw in Definition 3.3 that evaluating $f_T$ is equivalent to taking an inner product
of $T$ with a rank-1 tensor. This leads to the following equivalent definition.

Definition 3.8 (Reformulation of Definition 3.7). Let $\mathbb{K}$ be an extension of $\mathbb{F}$. A hitting set $\mathcal{H}$
for $\prod_{j=1}^{d} [\![n_j]\!]$ tensors of rank $\le r$ over $\mathbb{F}$ is a set of rank-1 tensors $\mathcal{H} \subseteq \mathbb{K}^{\prod_{j=1}^{d} [\![n_j]\!]}$ such that for
any $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ of rank $\le r$, $T$ is non-zero iff there exists $H \in \mathcal{H}$ such that $\langle T, H \rangle \ne 0$.

If $\mathcal{H}$ instead is not constrained to consist of rank-1 tensors, then we say $\mathcal{H}$ is an improper
hitting set.

As is common in PIT literature, we allow the use of the extension field $\mathbb{K}$, and in our case
$|\mathbb{K}| \le \mathrm{poly}(|\mathbb{F}|)$ will be sufficient. However, the results of Section 6.3 will show how to remove the
need for $\mathbb{K}$ from our results (with some loss).

We now define our notion of a low-rank recovery set, extending Definition 3.8. Note that we
drop here the restriction that the tensors must be rank 1.

Definition 3.9. A set of tensors $\mathcal{R} \subseteq \mathbb{K}^{\prod_{j=1}^{d} [\![n_j]\!]}$ is an $r$-low-rank-recovery set if for every
tensor $T : \prod_{j=1}^{d} [\![n_j]\!] \to \mathbb{F}$ with rank $\le r$, $T$ is uniquely determined by $y$, where $y \in \mathbb{K}^{\mathcal{R}}$ is defined
by $y_R := \langle T, R \rangle$, for $R \in \mathcal{R}$.

An algorithm performs recovery from $\mathcal{R}$ if, for each such $T$, it recovers $T$ given $y$.

We now show that, despite low-rank recovery being a stronger notion than a hitting set, hitting 
sets imply low-rank recovery with some loss in parameters, as seen by the following lemma. 
Lemma 3.10. If $\mathcal{H}$ is a (proper or improper) hitting set for $\prod_{j=1}^{d} [\![n_j]\!]$ tensors of rank $\le 2r$, then
$\mathcal{H}$ is an $r$-low-rank-recovery set for $\prod_{j=1}^{d} [\![n_j]\!]$ tensors also.

Proof. Let $A, B \in \mathbb{F}^{\prod_{j=1}^{d} [\![n_j]\!]}$ be tensors of rank $\le r$ such that their inner products with the
tensors in $\mathcal{H}$ are the same. By linearity of the inner product, it follows then that the tensor $A - B$
has rank $\le 2r$ and has zero inner product with each tensor in $\mathcal{H}$. As $\mathcal{H}$ is a hitting set, it follows
that $A - B = 0$ and thus $A = B$. Therefore, tensors of rank $\le r$ are determined by their inner
products with $\mathcal{H}$ and thus $\mathcal{H}$ is an $r$-low-rank-recovery set. □

We now discuss some trivial LRR results. The first result is the obvious low-rank recovery 
construction, which is extremely explicit but requires many measurements. 

Lemma 3.11. For $n \ge 1$, $d \ge 2$, there is a $\mathrm{polylog}(n, d, r)$-explicit $r$-low-rank-recovery set for $[\![n]\!]^d$
tensors, of size $n^d$. Further, recovery of $T$ is possible in $\mathrm{poly}(n^d)$ time.

Proof. For $\mathbf{i} = (i_1, \ldots, i_d) \in [\![n]\!]^d$, let $R_{\mathbf{i}} : [\![n]\!]^d \to \mathbb{F}$ be the rank 1 tensor which is
the indicator function for the set $\{(i_1, \ldots, i_d)\}$. Thus, $\langle T, R_{\mathbf{i}} \rangle = T(i_1, \ldots, i_d)$. It follows that $T = 0$
iff each such inner product is zero, and further that recovery of $T$ is possible (in $\mathrm{poly}(n^d)$ time).
The explicitness of the recovery set is also clear. □
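
The naive construction amounts to reading off every entry. A minimal Python sketch, assuming a dictionary representation of tensors and hypothetical helper names, is:

```python
# Sketch of the naive recovery set of Lemma 3.11: one indicator tensor per
# position, so the n^d measurements are exactly the entries of T.
from itertools import product

def positions(shape):
    return list(product(*(range(n) for n in shape)))

def indicator(shape, idx):
    """Rank-1 tensor that is 1 at position idx and 0 elsewhere."""
    return {pos: int(pos == idx) for pos in positions(shape)}

def inner(T1, T2):
    return sum(T1[pos] * T2[pos] for pos in T1)

def measure_all(T, shape):
    """Return the measurement <T, R_i> for every position i."""
    return {idx: inner(T, indicator(shape, idx)) for idx in positions(shape)}

shape = (2, 2, 2)                                        # n = 2, d = 3
T = {pos: sum(pos) * 3 - 1 for pos in positions(shape)}  # arbitrary entries
recovered = measure_all(T, shape)
```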

We now show, via the probabilistic method, that much smaller low-rank
recovery sets exist. To do so, we first cite the following form of the Schwartz-Zippel Lemma.

Lemma 3.12 (Schwartz-Zippel Lemma [Sch80, Zip79]). Let $f \in \mathbb{F}[x_1, \ldots, x_m]$ be a non-zero
polynomial of total degree $\le d$, and $S \subseteq \mathbb{F}$. Then $\Pr_{x \in S^m}[f(x) = 0] \le d/|S|$.

We now give a (standard) probabilistic method proof that small hitting sets exist (over finite 
fields). We present this not as a tight result, but as an example of what parameters one can hope 
to achieve. 

Lemma 3.13. Let $\mathbb{F}_q$ be the field on $q$ elements. Let $n \ge 1$ and $q > d \ge 2$. Then there is a hitting
set for $[\![n]\!]^d$ tensors of rank $\le r$, of size $\le dnr / \log_q(q/d) + 1 \approx dnr$. Further, there is an $r$-low-rank
recovery set of size $\le 2dnr / \log_q(q/d) + 2$.

Proof. For any non-zero tensor $T : [\![n]\!]^d \to \mathbb{F}_q$, $f_T$ has degree $d$, and thus by the Schwartz-Zippel
Lemma, for a random $\mathbf{a} \in \mathbb{F}_q^{dn}$, $f_T(\mathbf{a}) = 0$ with probability at most $d/q$. There are at most $q^{dnr}$
such non-zero tensors of rank $\le r$. By a union bound, it follows that $k$ random points are not a hitting set for
rank $\le r$ tensors with probability at most $q^{dnr} (d/q)^k$, which is $< 1$ if $k > dnr / \log_q(q/d)$. The
low-rank-recovery set follows from Lemma 3.10. □
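
To see how the union bound translates into a concrete count, the following Python sketch finds the smallest $k$ with $q^{dnr}(d/q)^k < 1$ using exact rational arithmetic. The function name `points_needed` and the parameter values are illustrative assumptions.

```python
# Sketch: smallest number k of random points for which the union bound
# q^(dnr) * (d/q)^k from the proof of Lemma 3.13 drops below 1.
from fractions import Fraction

def points_needed(q, d, n, r):
    failure = Fraction(q) ** (d * n * r)   # bound on the number of tensors
    k = 0
    while failure >= 1:
        failure *= Fraction(d, q)          # each point fails with prob <= d/q
        k += 1
    return k

# With q = 16, d = 2, n = 3, r = 1 we have dnr = 6 and log_q(q/d) = 3/4,
# so dnr / log_q(q/d) = 8 and the first k exceeding it is 9.
k = points_needed(16, 2, 3, 1)
```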

We now briefly remark on the tightness of the above result. The general case of tensors is not
well understood, as it is not well understood how many tensors there are of a given rank. For
matrices, the situation is much clearer. In particular, Roth [Rot91] showed (using the language
of rank-metric codes) that over finite fields the best (improper) hitting set for $n \times n$ matrices of
rank $\le r$ is of size $nr$, and over algebraically closed fields the best (improper) hitting set is of size
$(2n - r)r$. As we will aim to be field independent, the second bound is more relevant, and we indeed
match this bound (as seen in Theorem 5.10) with a proper hitting set.

Clearly, the above lemma is non-explicit. However, it yields a much smaller hitting set than the
$n^d$ result given in Lemma 3.11. Note that previous work (even for $d = 2$) on LRR and rank-metric
codes did not focus on requiring that the measurements are rank-1 tensors, and thus cannot be
used for PIT. Given this lack of knowledge, this paper seeks to construct proper hitting sets, and
low-rank-recovery sets, that are both explicit and small.

We remark that any explicit hitting set naturally leads to tensor rank lower bounds*. The following
lemma, which can be seen as a special case of the more general results of Heintz-Schnorr [HS80]
and Agrawal [Agr05], shows this connection more concretely.

Lemma 3.14. Let $\mathcal{H}$ be a hitting set for $[\![n]\!]^d$ tensors of rank $\le r$, such that $|\mathcal{H}| < n^d$. Then there
is a $\mathrm{poly}(n^d, |\mathcal{H}|)$-explicit tensor of rank $> r$.

Proof. Consider the constraints imposed on a tensor $T$ by the system of equations $\langle T, \mathcal{H} \rangle = 0$.
There are $|\mathcal{H}|$ constraints and $n^d$ variables. It follows that there is a non-zero $T$ solving this
system. By the definition of a hitting set, it follows that $\mathrm{rank}(T) > r$. That $T$ is explicit follows
from Gaussian Elimination. □
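
The Gaussian elimination step can be sketched as follows for $d = 2$ and $n = 2$. The three rank 1 measurements below are a hand-picked toy example (one can verify directly that they hit every non-zero rank 1 matrix of size $2 \times 2$); the function name `kernel_vector` and the flattening of matrices into length-4 vectors are assumptions of this sketch.

```python
# Sketch of the proof of Lemma 3.14: with fewer constraints than variables,
# Gaussian elimination yields a non-zero tensor orthogonal to every measurement.
from fractions import Fraction

def kernel_vector(rows, num_vars):
    """Non-zero x with row . x == 0 for all rows; assumes len(rows) < num_vars."""
    A = [[Fraction(v) for v in row] for row in rows]
    pivot_cols, rank = [], 0
    for col in range(num_vars):
        pivot = next((i for i in range(rank, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        scale = A[rank][col]
        A[rank] = [v / scale for v in A[rank]]
        for i in range(len(A)):
            if i != rank and A[i][col] != 0:
                factor = A[i][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[rank])]
        pivot_cols.append(col)
        rank += 1
    free = next(c for c in range(num_vars) if c not in pivot_cols)
    x = [Fraction(0)] * num_vars
    x[free] = Fraction(1)
    for row_index, col in enumerate(pivot_cols):
        x[col] = -A[row_index][free]
    return x

def outer_flat(u, v):
    """Rank-1 matrix u^T v, flattened row by row."""
    return [a * b for a in u for b in v]

H = [outer_flat([1, 0], [1, 0]),
     outer_flat([1, 1], [1, 1]),
     outer_flat([0, 1], [0, 1])]

T = kernel_vector(H, 4)                          # a flattened 2 x 2 matrix
determinant = T[0] * T[3] - T[1] * T[2]          # non-zero means rank 2 > 1
```

In this instance the kernel is spanned by the matrix with rows $(0, -1)$ and $(1, 0)$, which has rank 2.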

For $d = 2$, the above is less interesting, as matrix rank is well understood and we know many
matrices of high rank. For $d \ge 3$, tensor rank is far less understood. For $d = 3$, the best known
lower bounds for the rank of explicit tensors, over arbitrary fields, due to Alexeev, Forbes, and
Tsimerman [AFT11], are $3n - O(\lg n)$ (over $\mathbb{F}_2$, a lower bound of $3.52n$ is known, essentially due
to Brown and Dobkin [BD80]). More generally, for any fixed $d$, no explicit tensors are known with
tensor rank $\omega(n^{\lfloor d/2 \rfloor})$. The above lemma shows that constructing hitting sets is at least as hard
as getting a lower bound on any specific tensor. In particular, constructing a hitting set for $[\![n]\!]^d$
tensors of rank $\le r$ of size $O(dnr^k)$ with $k < 2$ would yield new tensor rank lower bounds for odd
$d$, in particular $d = 3$. Such lower bounds would imply new circuit lower bounds, using the results
of Strassen [Str73] and Raz [Raz10]. Our results give a hitting set with $k \approx \lg d$, and we leave open
whether further improvements are possible.

We will mention the definitions and preliminaries of rank-metric codes in Section 8. 

3.1 Paper Outline 

We briefly outline the rest of the paper. In Section 4 we give our improved construction of
rank-preserving matrices, which were first constructed by Gabizon-Raz [GR08]. In Section 5 we then use
this construction to give our reduction from bivariate identity testing to univariate identity testing
(Section 5.1), which then readily yields our hitting set for matrices (Section 5.2). In Section 5.3 we
show an equivalent hitting set, which is more useful for low-rank-recovery.

Section 6 extends the above results to tensors, where Section 6.1 reduces $d$-variate identity
testing to univariate identity testing, and Section 6.2 uses this reduction to construct hitting
sets for tensors. Finally, Section 6.3 shows how to extend these results to any field.

Low-rank recovery of matrices is discussed in Section 7. It is split into two parts. Section 7.1 
shows how to decode dual Reed-Solomon codes, which we use as a sparse-recovery oracle. Sec- 
tion 7.2 shows how to, given any such sparse-recovery oracle, perform low-rank-recovery of matrices. 
Instantiating the oracle with dual Reed-Solomon codes gives our low-rank-recovery construction. 

Section 8 shows how to extend our LRR algorithms to tensors, and how to use these results to 
construct rank-metric codes. Finally, Section 9 discusses some problems left open by this work. 

* This connection, along with the connection to rank-metric codes mentioned earlier, can be put in a broader
setting: hitting sets (and thus lower bounds) for circuits from some class $\mathcal{C}$ are in a sense equivalent to $\mathcal{C}$-metric linear
codes. That is, codes where $\mathrm{dist}(x, y)$ is defined as the size of the smallest circuit whose truth table is the string
$x - y$. We do not pursue this idea further in this work.
4 Improved Construction of Rank-preserving Matrices

In this section we will give an improved version of the Gabizon-Raz lemma [GR08] on the
construction of rank-preserving matrices. The goal is to transform an $r$-dimensional subspace living
in an $n$-dimensional ambient space, to an $r$-dimensional subspace living in an $r$-dimensional
ambient space. We will later show (see Theorem 5.1) how to use such a transformation to reduce the
problem of PIT for $n \times m$ matrices of rank $\le r$ to the problem of PIT for $r \times r$ matrices of rank
$\le r$.

We first present the Gabizon-Raz lemma ([GR08], Lemma 6.1), stated in the language of this 
paper. 

Lemma (Gabizon-Raz ([GR08], Lemma 6.1)). Let $1 \le r \le n$. Let $M \in \mathbb{F}^{n \times r}$ be of rank $r$. Define
$A_\alpha \in \mathbb{F}^{r \times n}$ by $(A_\alpha)_{i,j} = \alpha^{ij}$. Then there are $\le nr^2$ values $\alpha \in \mathbb{F}$ such that $\mathrm{rank}(A_\alpha M) < r$.

Our version of this lemma gives a set of matrices parameterized by $\alpha$ where there are only $nr$
values of $\alpha$ that lead to $\mathrm{rank}(A_\alpha M) < r$. This saving of a factor of $r$ allows us to achieve an
$O((n + m)r)$-sized hitting set for matrices instead of an $O((n + m)r^2)$-sized hitting set. We comment more on the
necessity of this improvement in Remark 5.3. We now state our version of this lemma. Our proof
is very similar to that of Gabizon-Raz.

Theorem 4.1. Let $1 \le r \le n$. Let $M \in \mathbb{F}^{n \times r}$ be of rank $r$. Let $\mathbb{K}$ be a field extending $\mathbb{F}$, and
let $g \in \mathbb{K}$ be an element of order $\ge n$. Define $A_\alpha \in \mathbb{K}^{r \times n}$ by $(A_\alpha)_{i,j} = (g^i \alpha)^j$. Then there are
$\le nr - \binom{r+1}{2} < nr$ values $\alpha \in \mathbb{K}$ such that $\mathrm{rank}(A_\alpha M) < r$.

Proof. We will now treat $\alpha$ as a variable, and thus refer to $A_\alpha$ simply as $A$. The matrix $AM$ is an
$r \times r$ matrix, and thus the claim will follow from showing that $\det(AM)$ is a non-zero polynomial
in $\alpha$ of degree $\le nr - \binom{r+1}{2}$. As $r \ge 1$, $nr - \binom{r+1}{2} < nr$.

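To analyze this determinant, we invoke the Cauchy-Binet formula.

Lemma (Cauchy-Binet Formula, Lemma A.1). Let $m \ge n \ge 1$. Let $A \in \mathbb{F}^{n \times m}$, $B \in \mathbb{F}^{m \times n}$. For
$S \subseteq [\![m]\!]$, let $A_S$ be the $n \times |S|$ matrix formed from $A$ by taking the columns with indices in $S$. Let
$B_S$ be defined analogously, but with rows. Then

$$\det(AB) = \sum_{S \in \binom{[\![m]\!]}{n}} \det(A_S) \det(B_S)$$

so that

$$\det(AM) = \sum_{S \in \binom{[\![n]\!]}{r}} \det(A_S) \det(M_S) .$$

For $S = \{k_1, \ldots, k_r\}$,

$$\det(A_S) = \begin{vmatrix} \alpha^{k_1} & \cdots & \alpha^{k_r} \\ (g\alpha)^{k_1} & \cdots & (g\alpha)^{k_r} \\ \vdots & & \vdots \\ (g^{r-1}\alpha)^{k_1} & \cdots & (g^{r-1}\alpha)^{k_r} \end{vmatrix} = \alpha^{\sum_{\ell=1}^{r} k_\ell} \begin{vmatrix} 1 & \cdots & 1 \\ g^{k_1} & \cdots & g^{k_r} \\ \vdots & & \vdots \\ (g^{k_1})^{r-1} & \cdots & (g^{k_r})^{r-1} \end{vmatrix} = \alpha^{\sum_{\ell=1}^{r} k_\ell} \prod_{1 \le i < j \le r} \left( g^{k_j} - g^{k_i} \right) .$$

By assumption the order of $g$ is $\ge n$, so the elements $(g^k)_{0 \le k < n}$ are distinct, implying that the
above Vandermonde determinant is non-zero.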
Further, we observe that $\deg_\alpha \det(A_S) = \sum_{k \in S} k$. As $S \in \binom{[\![n]\!]}{r}$ it follows that $\sum_{k \in S} k \le
\sum_{k=n-r}^{n-1} k = nr - \binom{r+1}{2}$, and thus $\deg_\alpha \det(AM) \le nr - \binom{r+1}{2}$ also.

We now show $\det(AM)$ is not identically zero, as a polynomial in $\alpha$. We show this by showing
that there is no cancellation of terms at the highest degree of $\det(AM)$. That is, there is a unique
set $S \in \binom{[\![n]\!]}{r}$ maximizing $\sum_{k \in S} k$ subject to $\det(M_S) \ne 0$. This is proven by the following lemma.

Lemma 4.2. Let $m \ge n \ge 1$. Let $M$ be an $n \times m$ matrix of rank $n$. For $S \subseteq [\![m]\!]$, denote $M_S$ as
the $n \times |S|$ matrix formed by taking the columns in $M$ (in order) whose indices are in $S$. Denote
$w(S) := \sum_{s \in S} s$. Then there is a unique set $S \in \binom{[\![m]\!]}{n}$ that maximizes $w(S)$ subject to $\det(M_S) \ne 0$.

Proof. The proof uses the ideas of the Steinitz Exchange Lemma. That is, recall the following facts
in linear algebra. If sets $S_1, S_2$ are both sets of linearly independent vectors, and $|S_1| > |S_2|$, then
there is some $v \in S_1 \setminus S_2$ such that $S_2 \cup \{v\}$ is linearly independent. Thus, if $S_1, S_2$ are both sets of
linearly independent vectors and $|S_1| = |S_2|$ then for any $w \in S_2 \setminus S_1$ there is a vector $v \in S_1 \setminus S_2$
such that $(S_2 \setminus \{w\}) \cup \{v\}$ is linearly independent.

Now suppose (for contradiction) that there are two different sets $S_1, S_2 \subseteq [\![m]\!]$ that maximize
$w(S)$ over the sets such that $\det(M_S) \ne 0$, so that $|S_1| = |S_2| = n$. Pick the smallest index $k$
in the (non-empty) symmetric difference $(S_2 \setminus S_1) \cup (S_1 \setminus S_2)$. Without loss of generality suppose
$k \in S_2 \setminus S_1$. It follows that there is an index $l \in S_1 \setminus S_2$ such that the columns in $S_3 := (S_2 \setminus \{k\}) \cup \{l\}$
are linearly independent (by the Steinitz Exchange Lemma), and thus $\det(M_{S_3}) \ne 0$ as $|S_3| = n$ by
construction.

By choice of $k$ and construction of $l$, $k \ne l$ and thus $k < l$. Thus, $w(S_3) = w(S_2) + l - k > w(S_2)$.
However, this contradicts that $S_2$ was a maximizer of $w(S)$ subject to $\det(M_S) \ne 0$. Thus, the
assumption of non-unique maximizers is false; there must be a unique maximizer. □
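
Lemma 4.2 can be checked by brute force on small matrices. The Python sketch below enumerates all column subsets of size $n$, keeps those with non-zero determinant, and returns the ones of maximum weight; the helper names and the example matrix are illustrative assumptions, and the recursive determinant is only suitable for tiny sizes.

```python
# Brute-force illustration of Lemma 4.2: among column sets S with det(M_S) != 0,
# the maximum of w(S) = sum of indices is attained by exactly one set.
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def max_weight_bases(M):
    """All column index sets of size n with non-zero determinant and maximal weight."""
    n, m = len(M), len(M[0])
    best_weight, best_sets = None, []
    for S in combinations(range(m), n):
        if det([[row[c] for c in S] for row in M]) == 0:
            continue
        weight = sum(S)
        if best_weight is None or weight > best_weight:
            best_weight, best_sets = weight, [S]
        elif weight == best_weight:
            best_sets.append(S)
    return best_sets

# Columns 2 and 3 are equal, so the heaviest set {2, 3} is dependent and the
# maximizer is the next heaviest independent set.
M = [[1, 0, 1, 1],
     [0, 1, 1, 1]]
maximizers = max_weight_bases(M)
```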

Thus $\det(AM)$ is a non-zero polynomial of degree $\le nr - \binom{r+1}{2}$ in $\alpha$, so there are at most that
many values such that $\det(AM) = 0$. □
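
The bound of Theorem 4.1 can be observed experimentally over a small prime field. The following Python sketch takes $\mathbb{F} = \mathbb{K} = \mathbb{F}_{13}$ with $g = 2$ (which has multiplicative order 12), builds $A_\alpha$ for every $\alpha$, and collects the values where the rank drops. The choice of field, of $M$, and the function names are assumptions made purely for illustration.

```python
# Experiment for Theorem 4.1 over F_p: count alpha with rank(A_alpha M) < r
# and compare against the bound n*r - binom(r+1, 2).

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p (p prime) by Gaussian elimination."""
    A = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(A[0])):
        pivot = next((i for i in range(rank, len(A)) if A[i][col]), None)
        if pivot is None:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        inv = pow(A[rank][col], p - 2, p)          # inverse via Fermat
        A[rank] = [v * inv % p for v in A[rank]]
        for i in range(len(A)):
            if i != rank and A[i][col]:
                factor = A[i][col]
                A[i] = [(a - factor * b) % p for a, b in zip(A[i], A[rank])]
        rank += 1
    return rank

def bad_alphas(M, r, g, p):
    """All alpha in F_p for which A_alpha M has rank below r."""
    n = len(M)
    bad = []
    for alpha in range(p):
        # (A_alpha)[i][j] = (g^i * alpha)^j, with the convention 0^0 = 1.
        A = [[pow(pow(g, i, p) * alpha % p, j, p) for j in range(n)] for i in range(r)]
        AM = [[sum(A[i][k] * M[k][c] for k in range(n)) % p for c in range(len(M[0]))]
              for i in range(r)]
        if rank_mod_p(AM, p) < r:
            bad.append(alpha)
    return bad

p, g, n, r = 13, 2, 4, 2
M = [[1, 0], [0, 1], [1, 1], [2, 5]]               # n x r of rank r
bad = bad_alphas(M, r, g, p)
bound = n * r - r * (r + 1) // 2                   # equals 5 here
```

Note that $\alpha = 0$ is always among the bad values when $r \ge 2$, since all rows of $A_0$ coincide.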

We remark that Lemma 4.2 can be seen as a special case of a more general result about matroids,
which states that if each element in the ground set has a unique (positive) weight, then there is
a unique independent set with maximal weight. However, as we index matrix columns starting at
0, this general fact does not immediately apply. Rather, we implicitly use that all bases in vector
matroids have the same number of vectors. In such a case, the weight function can be shifted by
an additive constant without affecting the property of having a unique maximizer.

We now extend the above result to the case when the rank of the $n \times r$ matrix may be less than
$r$. This will be useful when studying hitting sets for rank $\le r$ matrices, for then we do not know
the true rank of the unknown matrix, and only have the bound of "$\le r$".

Corollary 4.3. Let $1 \le s \le r \le n$. Let $M \in \mathbb{F}^{n \times r'}$ be of rank $s$, for $r' \ge s$. Let $\mathbb{K}$ be a field
extending $\mathbb{F}$, and let $g \in \mathbb{K}$ be an element of order $\ge n$. Define $A_\alpha \in \mathbb{K}^{r \times n}$ by $(A_\alpha)_{i,j} = (g^i \alpha)^j$.
Then there are $\le nr - \binom{r+1}{2} < nr$ values $\alpha \in \mathbb{K}$ such that the first $s$ rows of $A_\alpha M$ have rank $< s$.

Proof. Consider $M' \in \mathbb{F}^{n \times s}$ to be a matrix formed from $s$ basis columns of $M$. It follows, from
Theorem 4.1, that there are at most $ns - \binom{s+1}{2}$ values of $\alpha$ such that the $s \times n$ matrix $A'_\alpha$ has
$\mathrm{rank}(A'_\alpha M') < s$. As $\mathrm{rank}(AM') = \mathrm{rank}(AM)$ holds for any $A$, there are at most $ns - \binom{s+1}{2}$ many
values of $\alpha$ such that $\mathrm{rank}(A'_\alpha M) < s$. Also, as $ns - \binom{s+1}{2} \le nr - \binom{r+1}{2}$ for $s \le r \le n$, it also holds
that there are $\le nr - \binom{r+1}{2}$ values of $\alpha$ such that $\mathrm{rank}(A'_\alpha M) < s$. Finally the claim follows by
observing that, by construction, $A'_\alpha M$ is exactly the first $s$ rows of $A_\alpha M$. □
5 Identity Testing for Matrices

The previous section showed how we can map an $r$-dimensional subspace of an $n$-dimensional
ambient space to an $r$-dimensional subspace of an $r$-dimensional ambient space. In this section,
we will use this map to reduce the PIT problem for rank $\le r$ matrices of size $n \times m$ to the PIT
problem for rank $\le r$ matrices of size $r \times r$. This will be done by applying the dimension reduction
twice, once to the rows and once to the columns. Further, the $r \times r$ version can be solved in $r^2$
evaluations, using the naive approach of Lemma 3.11 in querying each entry in the matrix. When
phrased this way, one can show that this gives an $O((n + m)\,\mathrm{poly}(r))$-sized hitting set. This reduction
idea is analogous to the kernelization technique used in fixed-parameter tractability, but we do not
develop this connection further. While this idea demonstrates the feasibility of the rough bound
cited above, we actually achieve an $O((n + m)r)$-sized hitting set via tighter analysis.



5.1 Variable Reduction 

Before giving the hitting set construction and its analysis, we first present the main theorem used
in the analysis. While its statement seems unrelated to the intuition presented above, the proof
will exploit this intuition. When interpreting the result, recall that we index entries in matrices
(and vectors) starting at 0, as well as recalling the definition of $\hat{f}_T$ from a tensor $T$.

Theorem 5.1. Let $m \ge n \ge r \ge 1$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$ has order $\ge m$. Let
$M$ be an $n \times m$ matrix of rank $\le r$ over $\mathbb{F}$. Then $M$ is non-zero (over $\mathbb{F}$) iff one of the univariate
polynomials $\hat{f}_M(x, x), \hat{f}_M(x, gx), \ldots, \hat{f}_M(x, g^{r-1}x)$ is non-zero (over $\mathbb{K}$).
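
To illustrate why several substitutions $y = g^\ell x$ are needed, the Python sketch below computes the coefficient vectors of $\hat{f}_M(x, g^\ell x) = \sum_{i,j} M_{i,j}\, g^{\ell j} x^{i+j}$ over $\mathbb{F}_7$ with $g = 3$ (of multiplicative order 6). The field, the antisymmetric rank 2 matrix, and the function name `restricted_poly` are choices made for this example only. Note that the coefficient of $x^k$ depends only on the $k$-diagonal of $M$.

```python
# Coefficients of the univariate polynomial f_M(x, g^ell * x) over F_p,
# where f_M(x, y) = sum_{i,j} M[i][j] * x^i * y^j.

def restricted_poly(M, ell, g, p):
    n, m = len(M), len(M[0])
    coeffs = [0] * (n + m - 1)
    for i in range(n):
        for j in range(m):
            # The term M[i][j] * x^i * (g^ell x)^j contributes to x^(i+j).
            coeffs[i + j] = (coeffs[i + j] + M[i][j] * pow(g, ell * j, p)) % p
    return coeffs

p, g = 7, 3
# Antisymmetric matrix of rank 2 over F_7 (note 6 = -1 mod 7).
M = [[0, 1, 0],
     [6, 0, 0],
     [0, 0, 0]]

poly_0 = restricted_poly(M, 0, g, p)   # f_M(x, x): the two entries cancel
poly_1 = restricted_poly(M, 1, g, p)   # f_M(x, g x): coefficient of x is g - 1
```

For this rank 2 matrix the first polynomial vanishes identically while the second does not, consistent with the theorem's use of $r = 2$ substitutions.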

Proof. ($\Leftarrow$): If $M$ is zero then so must all $\hat{f}_M(x, g^\ell x)$ be as well. Taking the contrapositive yields
this direction.

($\Rightarrow$): Say $\mathrm{rank}(M) = s$. By assumption $0 < s \le r$. Recall that putting $M$ into reduced
row-echelon form yields a decomposition $M = PQ^T$, such that $P \in \mathbb{F}^{n \times s}$ and $Q \in \mathbb{F}^{m \times s}$ with
$\mathrm{rank}(P) = \mathrm{rank}(Q) = s$. We remark that it is crucial for our proof that we have "$\mathrm{rank}(P) =
\mathrm{rank}(Q) = s$" here. Invoking the bound "$\mathrm{rank}(P), \mathrm{rank}(Q) \le s$", which one gets directly via the
definition of rank of $M$, is insufficient.

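We now exploit the kernelization idea mentioned above. Consider the matrices $A_\alpha \in \mathbb{K}^{r \times n}$ and
$B_\alpha \in \mathbb{K}^{r \times m}$ defined by $(A_\alpha)_{i,j} = (g^i\alpha)^j$ and $(B_\alpha)_{i,j} = (g^i\alpha)^j$. Now consider $A_\alpha P$ and $B_\alpha Q$, which
have sizes $r \times s$ each. Write them in block notation as
$\begin{pmatrix} P'_\alpha \\ P''_\alpha \end{pmatrix}$ and $\begin{pmatrix} Q'_\alpha \\ Q''_\alpha \end{pmatrix}$, so that $P'_\alpha$ and $Q'_\alpha$ are
both $s \times s$ matrices.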

By our refinement of the Gabizon-Raz lemma [GR08], our Corollary 4.3, it follows that there
are $< nr$ values of $\alpha$ such that $\mathrm{rank}(P'_\alpha) < s$ and $< mr$ values of $\alpha$ such that $\mathrm{rank}(Q'_\alpha) < s$. By
the union bound, there are $< (n+m)r$ values such that $\mathrm{rank}(P'_\alpha) < s$ or $\mathrm{rank}(Q'_\alpha) < s$. Let $\mathbb{H}$ be
an extension field of $\mathbb{K}$, such that $|\mathbb{H}| \ge (n+m)r$. It follows that there is some $\alpha \in \mathbb{H}$ such that
$\mathrm{rank}(P'_\alpha) = s$ and $\mathrm{rank}(Q'_\alpha) = s$. Fix this as the value of $\alpha$, and we now drop $\alpha$ from our notation.

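Via block multiplication we see that

\[ AMB^\dagger = AP(BQ)^\dagger = \begin{pmatrix} P' \\ P'' \end{pmatrix} \begin{pmatrix} (Q')^\dagger & (Q'')^\dagger \end{pmatrix} = \begin{pmatrix} P'(Q')^\dagger & P'(Q'')^\dagger \\ P''(Q')^\dagger & P''(Q'')^\dagger \end{pmatrix}. \]

As $\mathrm{rank}(P') = s$ and $\mathrm{rank}(Q') = s$, it follows that $\mathrm{rank}(P'(Q')^\dagger) = s$. We remark that it is here where
the naive bound "$\mathrm{rank}(P), \mathrm{rank}(Q) \le s$" is insufficient, and we crucially use that "$\mathrm{rank}(P) =
\mathrm{rank}(Q) = s$".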
As $\mathrm{rank}(P'(Q')^\dagger) = s$, and $P'(Q')^\dagger$ is an $s \times s$ matrix, it follows that some entry in its first row
(which has index 0, by our notation) is non-zero. As $P'(Q')^\dagger$ is the leading principal $s \times s$ submatrix of $AMB^\dagger$, it follows
that some entry in the first row of $AMB^\dagger$ is non-zero. Denote row $i$ of $A$ as $A_i$, and row $j$ of $B$
as $B_j$. As the first row of $AMB^\dagger$ is $A_0 M B^\dagger$, it follows that there is some $0 \le \ell \le r-1$ such that
$A_0 M B_\ell^\dagger \ne 0$. Expanding this evaluation out, we see that

\begin{align*}
A_0 M B_\ell^\dagger = \langle M, A_0^\dagger B_\ell \rangle
  &= \sum_{i=0,j=0}^{n-1,m-1} M_{i,j} \cdot (A_0)_i (B_\ell)_j \\
  &= \sum_{i=0,j=0}^{n-1,m-1} M_{i,j} A_{0,i} B_{\ell,j} \\
  &= \sum_{i=0,j=0}^{n-1,m-1} M_{i,j}\, \alpha^i (g^\ell \alpha)^j \\
  &= f_M(\alpha, g^\ell \alpha).
\end{align*}

Thus, we see that $f_M(x, g^\ell x)$ has a non-zero point over the field $\mathbb{H}$. It follows that it is a non-zero
polynomial over $\mathbb{H}$. As it has coefficients over $\mathbb{K}$, $f_M(x, g^\ell x)$ is non-zero over $\mathbb{K}$ as well. □
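
As an informal sanity check of Theorem 5.1, the following sketch computes the coefficients of $f_M(x, g^\ell x)$ for random matrices of rank $\le r$ and confirms that some restriction is non-zero whenever $M$ is. The choice $\mathbb{F} = \mathbb{K} = \mathrm{GF}(101)$ and all function names are assumptions made for this illustration; they are not part of the paper. The last lines show why the rank promise is needed: $f(x,y) = y - x$ comes from a rank-2 matrix and vanishes at $y = x$.

```python
import random

P = 101  # small prime; the demo takes F = K = GF(P), an assumption of this sketch


def mult_order(g, p):
    """Multiplicative order of g modulo the prime p (assumes p does not divide g)."""
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k


def random_low_rank(n, m, r, p, rng):
    """Random n x m matrix P Q^T over GF(p); its rank is at most r."""
    Pm = [[rng.randrange(p) for _ in range(r)] for _ in range(n)]
    Qm = [[rng.randrange(p) for _ in range(r)] for _ in range(m)]
    return [[sum(Pm[i][t] * Qm[j][t] for t in range(r)) % p for j in range(m)]
            for i in range(n)]


def restricted_coeffs(M, g, ell, p):
    """Coefficients of f_M(x, g^ell x) = sum_{i,j} M[i][j] * g^(ell*j) * x^(i+j)."""
    n, m = len(M), len(M[0])
    coeffs = [0] * (n + m - 1)
    for i in range(n):
        for j in range(m):
            coeffs[i + j] = (coeffs[i + j] + M[i][j] * pow(g, ell * j, p)) % p
    return coeffs


def some_restriction_nonzero(M, g, r, p):
    """True iff one of f_M(x, x), ..., f_M(x, g^(r-1) x) is a non-zero polynomial."""
    return any(any(restricted_coeffs(M, g, ell, p)) for ell in range(r))


n, m, r = 4, 6, 2
g = next(a for a in range(2, P) if mult_order(a, P) >= m)  # element of order >= m
rng = random.Random(0)
samples = [random_low_rank(n, m, r, P, rng) for _ in range(200)]
nonzero = [M for M in samples if any(any(row) for row in M)]
all_detected = all(some_restriction_nonzero(M, g, r, P) for M in nonzero)

# Without the rank promise the test can fail: [[0, 1], [-1, 0]] has rank 2,
# f_M(x, y) = y - x, and f_M(x, x) = 0 although M is non-zero.
rank2 = [[0, 1], [P - 1, 0]]
missed_without_promise = not some_restriction_nonzero(rank2, g, 1, P)
```
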

Remark 5.2. We now remark on how to implement the kernelization idea, mentioned in the
introduction to this section, in a more straightforward sense. One can see that $\mathrm{rank}(P'(Q')^\dagger) = s$
shows that $AMB^\dagger \ne 0$. As $AMB^\dagger$ is of size $r \times r$, we can then run the naive $r^2$-size hitting set of
Lemma 3.11 for $r \times r$-sized matrices, which checks each individual entry. Noting that the $(i,j)$-th
entry of $AMB^\dagger$ is equal to $\langle M, A_i^\dagger B_j \rangle$, we see that we can implement this naive hitting set as a
hitting set for $n \times m$ matrices.

Thus, for each $\alpha$ there are $r^2$ rank-1 matrices to test, and we need at most $(n+m)r$ choices of
$\alpha$ (where here we assume $\mathbb{K}$ is at least this big). It follows that there exists an explicit hitting set
of size $(n+m)r^3$.

Remark 5.3. We briefly discuss the necessity of our version of the Gabizon-Raz lemma for the above
proof. The above proof does not invoke our version of the lemma in the fullest, in the sense that
the $nr$ bound on the number of "bad" $\alpha$ was only used in the sense that it was a finite bound.
Thus, given that our version of the lemma "only" improves the $nr^2$ bound of Gabizon-Raz to $nr$,
it may be unclear why our version is needed here.

The crucial use of our version of the lemma is keeping the degree low. That is, if one invoked
the original Gabizon-Raz lemma, one would obtain "$M$ is non-zero iff one of the univariate
polynomials $f_M(x,x), f_M(x,x^2), \ldots, f_M(x,x^r)$ is non-zero". While this is correct, it will lead to a
larger hitting set as one needs to interpolate $r$ polynomials, each of degree $\approx rn$, which will give an
$O(nr^2)$-sized set instead of the $O(nr)$-sized set we are able to achieve.

We also state an equivalent version of this result, which will be useful for higher-degree tensors. 

Corollary 5.4. Let $m \ge n \ge r \ge 1$. Over the field $\mathbb{F}$, consider the bivariate polynomial $f(x,y) =
\sum_{i=1}^{r} p_i(x) q_i(y)$ such that $\deg(p_i) < n$ and $\deg(q_i) < m$ for all $i$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such
that $g \in \mathbb{K}$ has order $\ge m$.

Then $f$ is non-zero (over $\mathbb{F}$) iff one of the univariate polynomials
$f(x,x), f(x,gx), \ldots, f(x,g^{r-1}x)$ is non-zero (over $\mathbb{K}$).
5.2 The Hitting Set for Matrices



In this subsection we use the theorem of the last subsection to construct hitting sets for
matrices. First, recall our notion of a hitting set for matrices, as given in Definition 3.8. Now recall
that Theorem 5.1 shows that for any $M$ of rank $\le r$, $M$ is non-zero iff one of the polynomials in
$\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$ is non-zero. In the preliminaries it was seen that evaluating one of these
polynomials at a point $\alpha$ is equivalent to taking an inner product $\langle M, A \rangle$ with a rank-1 matrix $A$. This
leads naturally to the following idea: interpolate each of the $r$ polynomials in $\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$.
As each polynomial is of degree $\le n+m-2$, this will lead to $(n+m-1)r$ inner products. Then $M$
is non-zero iff one of these inner products is non-zero. This is the exact idea, which we now make
formal.

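Construction 5.5. Let $m \ge n \ge r \ge 1$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$ is of order
$\ge m$ and $\alpha_0, \ldots, \alpha_{n+m-2} \in \mathbb{K}$ are distinct. Let $B_{k,\ell} \in \mathbb{K}^{n \times m}$ be the rank-1 matrix defined by

$(B_{k,\ell})_{i,j} = \alpha_k^i (g^\ell \alpha_k)^j$, and let $\mathcal{B}_{r,n,m} := \{B_{k,\ell}\}_{0 \le \ell < r,\ 0 \le k \le n+m-2}$.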

We now give the analysis for this hitting set. 

Theorem 5.6. Let $m \ge n \ge r \ge 1$. Then $\mathcal{B}_{r,n,m}$, as defined in Construction 5.5, has the following
properties:

1. $\mathcal{B}_{r,n,m}$ is a hitting set for $n \times m$ matrices of rank $\le r$ over $\mathbb{F}$.

2. $|\mathcal{B}_{r,n,m}| = (n+m-1)r$.

3. $\mathcal{B}_{r,n,m}$ can be computed in poly($m$) operations, where operations (including a successor
function in some enumeration of $\mathbb{K}$) over $\mathbb{K}$ are counted at unit cost.

Proof. $|\mathcal{B}_{r,n,m}| = (n+m-1)r$: This is by definition.

$\mathcal{B}_{r,n,m}$ can be computed in poly($m$) operations: We assume here an enumeration of elements in
$\mathbb{K}$ such that the successor in this enumeration can be computed at unit cost. We also will assume
testing whether an element is zero, as well as arithmetic operations in the field, are done at unit
cost.

First observe that there are at most $m$ solutions to $x^m - 1$ over $\mathbb{K}$, so if we enumerate $m+1$
elements of $\mathbb{K}$, then we can find a $g \in \mathbb{K}$ with order $\ge m$. This is in poly($m$) operations. Similarly,
the enumeration will give us $n+m-1$ distinct elements which yield the desired $\alpha_k$. Then, computing
each $B_{k,\ell}$ can be done in poly($m$) steps, and there are poly($m$) of them. Thus, all of $\mathcal{B}_{r,n,m}$ can be
computed in this many operations.

$\mathcal{B}_{r,n,m}$ is a hitting set: $\mathcal{B}_{r,n,m}$ is a set of rank-1 matrices by construction, so it remains to prove
that it hits each low-rank matrix. Let $M$ be an $n \times m$ matrix of rank $\le r$ over $\mathbb{F}$. By Theorem 5.1, we see
that $M$ is non-zero iff one of the polynomials $\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$ is non-zero. Now observe that
$f_M(\alpha_k, g^\ell \alpha_k) = \sum_{0 \le i < n,\ 0 \le j < m} M_{i,j}\, \alpha_k^i (g^\ell \alpha_k)^j = \langle M, B_{k,\ell} \rangle$. As each $f_M(x, g^\ell x)$ is of degree $\le n+m-2$ and we
evaluate each polynomial at $n+m-1$ points, each $f_M(x, g^\ell x)$ is fully determined by these evaluations
via the polynomial interpolation map. Specifically, if $f_M(x, g^\ell x)$ is non-zero then it must have a
non-zero evaluation for some $\alpha_k$. As some $f_M(x, g^\ell x)$ is non-zero by Theorem 5.1, it follows that
$\langle M, B_{k,\ell} \rangle \ne 0$ for some $0 \le \ell < r$ and $0 \le k \le n+m-2$. □
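
The construction and its guarantee can be exercised on a toy instance. The sketch below builds the test matrices $B_{k,\ell}$ over $\mathrm{GF}(101)$ (a field chosen only for illustration) and checks that every sampled non-zero matrix of rank $\le r$ has a non-zero inner product with some test matrix. The helper names `matrix_hitting_set`, `inner` and `is_hit` are hypothetical.

```python
import random

P = 101  # small prime; F = K = GF(P) is an assumption of this sketch


def mult_order(g, p):
    """Multiplicative order of g modulo the prime p."""
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k


def matrix_hitting_set(n, m, r, p):
    """Rank-1 test matrices with (i, j) entry a_k^i * (g^l a_k)^j, as in Construction 5.5."""
    g = next(a for a in range(2, p) if mult_order(a, p) >= m)
    alphas = range(1, n + m)  # n+m-1 distinct elements of GF(p); needs p > n+m-1
    tests = []
    for ell in range(r):
        for a in alphas:
            b = pow(g, ell, p) * a % p
            tests.append([[pow(a, i, p) * pow(b, j, p) % p for j in range(m)]
                          for i in range(n)])
    return tests


def inner(M, B, p):
    """Entrywise inner product <M, B> over GF(p)."""
    return sum(x * y for row_m, row_b in zip(M, B) for x, y in zip(row_m, row_b)) % p


def is_hit(M, tests, p):
    return any(inner(M, B, p) for B in tests)


def random_low_rank(n, m, r, p, rng):
    """Random n x m matrix of rank at most r over GF(p)."""
    Pm = [[rng.randrange(p) for _ in range(r)] for _ in range(n)]
    Qm = [[rng.randrange(p) for _ in range(r)] for _ in range(m)]
    return [[sum(Pm[i][t] * Qm[j][t] for t in range(r)) % p for j in range(m)]
            for i in range(n)]


n, m, r = 3, 5, 2
tests = matrix_hitting_set(n, m, r, P)
rng = random.Random(1)
nonzero = [M for M in (random_low_rank(n, m, r, P, rng) for _ in range(200))
           if any(any(row) for row in M)]
all_hit = all(is_hit(M, tests, P) for M in nonzero)
zero_not_hit = not is_hit([[0] * m for _ in range(n)], tests, P)
```
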

One deficiency with this construction is that for large $r$ it is suboptimal by a factor of 2.
That is, in the regime where $n = m$ and $r = n-1$ this construction gives a hitting set of size
$(2n-1)(n-1) \approx 2n^2$. However, the naive hitting set yields an $n^2$-sized set. In the next
subsection we show that this is an artifact of the analysis. That is, by pruning unneeded matrices
from the hitting set, one can show that our construction always (for $r < n$) does better than the
naive construction. This result is proven in Theorem 5.10.

5.3 An Alternate Construction 

In the previous subsections we saw that a low-rank matrix $M$ is non-zero iff one of the polynomials
$\{f_M(x, g^\ell x)\}$ was non-zero. To construct a hitting set, we then interpolated each $f_M(x, g^\ell x)$ at enough
points to determine which, if any, were non-zero. However, we are interpolating many "related"
polynomials all on the same points, so it is natural to wonder if there are some redundancies in
this process.

To phrase things differently, observe that testing a matrix $M$ against a hitting set $\mathcal{H}$ is really
asking whether $M \in \ker \mathcal{H}$. The promise that $M$ is low-rank ensures that $M \in \ker \mathcal{H}$ iff $M$ is zero. The
number of tests done is $|\mathcal{H}|$, but the number of actual tests is $\mathrm{rank}(\mathcal{H})$, where we consider $\mathcal{H}$ as
vectors in the vector space $\mathbb{K}^{nm}$. That is, some of the matrices in $\mathcal{H}$ may be linearly dependent,
and these are redundancies that can be pruned.

The aim of this section is to present hitting sets (and improper hitting sets) that have linearly
independent test matrices. The initial motivation is to observe that the point of the evaluations of
the $\{f_M(x, g^\ell x)\}$ was to interpolate the coefficients. Thus, instead of doing these evaluations, we
can express the coefficients of the $\{f_M(x, g^\ell x)\}$ directly as linear combinations of the entries in $M$.
This will lead to the following improper hitting set.

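Construction 5.7. Let $m \ge n \ge r \ge 1$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$ is of order
$\ge m$. Let $D_{k,\ell} \in \mathbb{K}^{n \times m}$ be the matrix defined by

\[ (D_{k,\ell})_{i,j} = \begin{cases} g^{\ell j} & \text{if } i + j = k \\ 0 & \text{else.} \end{cases} \]

Define $\mathcal{D}_{r,n,m} := \{D_{k,\ell}\}_{0 \le k \le n+m-2,\ 0 \le \ell < r}$ and
$\mathcal{D}'_{r,n,m} := \{D_{k,\ell}\}_{0 \le k \le n+m-2,\ 0 \le \ell < \min(r,\, k+1,\, (n+m)-(k+1))}$.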

We now analyze this construction. 

Theorem 5.8. Let $m \ge n \ge r \ge 1$. Then $\mathcal{D}_{r,n,m}$, as defined in Construction 5.7, has the following
properties:

1. $\mathcal{D}_{r,n,m}$ is an improper hitting set for $n \times m$ matrices of rank $\le r$ over $\mathbb{F}$.

2. $\mathrm{Span}(\mathcal{D}_{r,n,m}) = \mathrm{Span}(\mathcal{B}_{r,n,m})$ (as vectors in $\mathbb{K}^{nm}$).

3. $|\mathcal{D}_{r,n,m}| = (n+m-1)r$.

4. Each matrix in $\mathcal{D}_{r,n,m}$ is $n$-sparse.

5. $\mathcal{D}_{r,n,m}$ can be computed in poly($m$) operations, where operations (including a successor
function in some enumeration of $\mathbb{K}$) over $\mathbb{K}$ are counted at unit cost.

and $\mathcal{D}'_{r,n,m}$, as defined in Construction 5.7, has the following properties:

1. $\mathcal{D}'_{r,n,m}$ is an improper hitting set for $n \times m$ matrices of rank $\le r$ over $\mathbb{F}$.

2. $\mathcal{D}'_{r,n,m}$ is linearly independent (as vectors in $\mathbb{K}^{nm}$) and $\mathrm{Span}(\mathcal{D}_{r,n,m}) = \mathrm{Span}(\mathcal{D}'_{r,n,m})$.

3. $|\mathcal{D}'_{r,n,m}| = (n+m-r)r$.

4. Each matrix in $\mathcal{D}'_{r,n,m}$ is $n$-sparse.

5. $\mathcal{D}'_{r,n,m}$ can be computed in poly($m$) operations, where operations (including a successor
function in some enumeration of $\mathbb{K}$) over $\mathbb{K}$ are counted at unit cost.

Proof. $|\mathcal{D}_{r,n,m}| = (n+m-1)r$: This is by definition.

Sparsity of $\mathcal{D}_{r,n,m}$: Each matrix in the hitting set has support in some $k$-diagonal, and each
diagonal has at most $n$ non-zero entries.

$\mathcal{D}_{r,n,m}$ can be computed in poly($m$) operations: The details are very similar to the proof that
$\mathcal{B}_{r,n,m}$ can be computed in poly($m$) operations, as seen in Theorem 5.6, so we omit the specifics.

$\mathcal{D}_{r,n,m}$ is an improper hitting set: Let $M$ be an $n \times m$ matrix of rank $\le r$ over $\mathbb{F}$. By Theorem 5.1,
we see that $M$ is non-zero iff one of the polynomials $\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$ is non-zero. Recall the
notation that $\mathfrak{C}_{x^k}(f)$ denotes the coefficient of $f$ on $x^k$. Thus,
$\mathfrak{C}_{x^k}(f_M(x, g^\ell x)) = \sum_{i+j=k} M_{i,j}\, g^{\ell j} = \langle M, D_{k,\ell} \rangle$.
Thus, it follows that some $f_M(x, g^\ell x)$ is non-zero iff one of the inner products $\langle M, D_{k,\ell} \rangle$ is
non-zero. Invoking Theorem 5.1 completes this claim.

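This can also be seen from the fact that $\mathrm{Span}(\mathcal{D}_{r,n,m}) = \mathrm{Span}(\mathcal{B}_{r,n,m})$. Thus, for a matrix $M$,
$M \in \ker \mathcal{D}_{r,n,m}$ iff $M \in \ker \mathcal{B}_{r,n,m}$.

$\mathrm{Span}(\mathcal{D}_{r,n,m}) \supseteq \mathrm{Span}(\mathcal{B}_{r,n,m})$: For any $M$ (not just those of rank $\le r$) we have that $\langle M, D_{k,\ell} \rangle =
\mathfrak{C}_{x^k}(f_M(x, g^\ell x))$ and $\langle M, B_{k,\ell} \rangle = f_M(\alpha_k, g^\ell \alpha_k)$, and thus $\langle M, B_{k,\ell} \rangle = \sum_{k'=0}^{n+m-2} \langle M, D_{k',\ell} \rangle\, \alpha_k^{k'}$. By
taking $M$ for each element in some basis, it follows that $B_{k,\ell} = \sum_{k'=0}^{n+m-2} \alpha_k^{k'} D_{k',\ell}$.

$\mathrm{Span}(\mathcal{D}_{r,n,m}) \subseteq \mathrm{Span}(\mathcal{B}_{r,n,m})$: Similar to the above case, we get that for any $M$,

\[ \langle M, D_{k,\ell} \rangle = \sum_{k'=0}^{n+m-2} \mathfrak{C}_{x^k}\!\left( \prod_{k'' \ne k'} \frac{x - \alpha_{k''}}{\alpha_{k'} - \alpha_{k''}} \right) \langle M, B_{k',\ell} \rangle \]

via Lagrange interpolation. As the coefficients of this linear dependence are independent of $M$
(they only depend on the $\alpha_k$), by taking $M$ for each element of some basis it follows that the same
linear dependence for $D_{k,\ell}$ and the $B_{k',\ell}$ exists, giving the claim.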

$\mathcal{D}'_{r,n,m}$ can be computed in poly($m$) operations: As with $\mathcal{D}_{r,n,m}$, these details are omitted.

Sparsity of $\mathcal{D}'_{r,n,m}$: Each matrix in the hitting set has support in some $k$-diagonal, and each
diagonal has at most $n$ non-zero entries.

$\mathcal{D}'_{r,n,m}$ is an improper hitting set: This follows from showing that $\mathcal{D}_{r,n,m} \subseteq \mathrm{Span}(\mathcal{D}'_{r,n,m})$, as
this implies that for a matrix $M$, $M \in \ker \mathcal{D}_{r,n,m} \iff M \in \ker \mathcal{D}'_{r,n,m}$. Thus, as $\mathcal{D}_{r,n,m}$ is an
improper hitting set so is $\mathcal{D}'_{r,n,m}$.

$\mathrm{Span}(\mathcal{D}_{r,n,m}) \supseteq \mathrm{Span}(\mathcal{D}'_{r,n,m})$: This is clear, as $\mathcal{D}_{r,n,m} \supseteq \mathcal{D}'_{r,n,m}$.

$\mathrm{Span}(\mathcal{D}_{r,n,m}) \subseteq \mathrm{Span}(\mathcal{D}'_{r,n,m})$: Begin by observing that $D_{k,\ell}$ is non-zero only on the $k$-diagonal
and the $k$-diagonal has $\min(k+1, n, (n+m)-(k+1))$ entries. Further, the $k$-diagonals of the
matrices $\{D_{k,\ell}\}_{0 \le \ell < r}$ form the rows of an $r \times \min(k+1, n, (n+m)-(k+1))$ Vandermonde matrix. This
Vandermonde matrix is formed by taking powers of $\le n$ consecutive powers of $g$, which by the order
of $g$ are distinct. It follows that the first $\min(r, \min(k+1, n, (n+m)-(k+1)))$ of the $\{D_{k,\ell}\}_{0 \le \ell < r}$ form
a basis for the rest. As $r \le n$, $\min(r, \min(k+1, n, (n+m)-(k+1))) = \min(r, k+1, (n+m)-(k+1))$,
so $\{D_{k,\ell}\}_{0 \le \ell < \min(r, k+1, (n+m)-(k+1))}$ are a basis for $\{D_{k,\ell}\}_{0 \le \ell < r}$ (recall that we start indexing from
zero). Ranging over all $k$ shows that the claim holds.

$\mathcal{D}'_{r,n,m}$ is linearly independent: Notice that $D_{k,\ell}$ and $D_{k',\ell'}$ have disjoint support if $k \ne k'$. The
previous paragraph shows $\{D_{k,\ell}\}_{0 \le \ell < \min(r, k+1, (n+m)-(k+1))}$ are linearly independent for each $k$, and
the fact about disjoint support for differing $k$ shows that taking the union over $k$ does not introduce
any linear dependencies.
$|\mathcal{D}'_{r,n,m}| = (n+m-r)r$: For $r \le k+1 \le (n+m)-r$ we see that $r = \min(r, k+1, (n+m)-(k+1))$,
so $\mathcal{D}'_{r,n,m}$ offers no savings in this regime. For $0 \le k < r$, $\mathcal{D}'_{r,n,m}$ takes $r-(k+1)$ fewer matrices than
$\mathcal{D}_{r,n,m}$. For $n+m > k+1 > (n+m)-r$, $\mathcal{D}'_{r,n,m}$ takes $r-((n+m)-(k+1))$ fewer matrices than
$\mathcal{D}_{r,n,m}$. It follows that $|\mathcal{D}'_{r,n,m}| = |\mathcal{D}_{r,n,m}| - r(r-1) = (n+m-1)r - r(r-1) = (n+m-r)r$. □
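
The counting in this proof can be checked numerically on a small instance. The following sketch (over $\mathrm{GF}(101)$, an illustrative choice; `D` and `rank_mod_p` are hypothetical helper names) builds $\mathcal{D}_{r,n,m}$ and the pruned set $\mathcal{D}'_{r,n,m}$ as flattened vectors and compares sizes and ranks.

```python
P = 101  # small prime; K = GF(P) is an assumption of this sketch


def mult_order(g, p):
    """Multiplicative order of g modulo the prime p."""
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k


def rank_mod_p(rows, p):
    """Rank over GF(p) of a list of equal-length vectors, by Gaussian elimination."""
    rows = [list(v) for v in rows]
    rank = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rank, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][c], p - 2, p)  # inverse via Fermat's little theorem
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][c] % p:
                f = rows[i][c]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def D(k, ell, n, m, g, p):
    """Flattened n x m matrix with g^(ell*j) on the diagonal i + j = k, zero elsewhere."""
    return [pow(g, ell * j, p) if i + j == k else 0 for i in range(n) for j in range(m)]


n, m, r = 3, 5, 2
g = next(a for a in range(2, P) if mult_order(a, P) >= m)
D_full = [D(k, l, n, m, g, P) for k in range(n + m - 1) for l in range(r)]
D_pruned = [D(k, l, n, m, g, P) for k in range(n + m - 1)
            for l in range(min(r, k + 1, (n + m) - (k + 1)))]
max_support = max(sum(1 for x in v if x) for v in D_full)
```

On this instance the pruned family has $(n+m-r)r = 12$ members, all linearly independent, and the full family of $14$ matrices has the same rank.
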

We sketch another proof of this result in Remark 7.24. 

The above results also imply that $\mathrm{rank}(\mathcal{B}_{r,n,m}) = (n+m-r)r$, which is better than the analysis
given in Theorem 5.6. This immediately gives that there are explicit $(n+m-r)r$-sized (proper)
hitting sets for $n \times m$ matrices of rank $\le r$, as we can (in poly($m$) steps) find a basis for $\mathcal{B}_{r,n,m}$. This
basis will consist of rank-1 matrices, and also be the desired hitting set. However, in the interest
of being more explicit, we present the following construction.

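Construction 5.9. Let $m \ge n \ge r \ge 1$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$ is of order
$\ge m$ and $\alpha_0, \ldots, \alpha_{n+m-2} \in \mathbb{K}$ are distinct. Let $B'_{k,\ell} \in \mathbb{K}^{n \times m}$ be the rank-1 matrix defined by

$(B'_{k,\ell})_{i,j} = \alpha_k^i (g^\ell \alpha_k)^j$, and let $\mathcal{B}'_{r,n,m} := \{B'_{k,\ell}\}_{0 \le \ell < r,\ 0 \le k \le (n+m-2)-2\ell}$.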

We now give the analysis for this hitting set. 

Theorem 5.10. Let $m \ge n \ge r \ge 1$. Then $\mathcal{B}'_{r,n,m}$, as defined in Construction 5.9, has the following
properties:

1. $\mathrm{Span}\,\mathcal{B}'_{r,n,m} = \mathrm{Span}\,\mathcal{B}_{r,n,m}$, where $\mathcal{B}_{r,n,m}$ is defined in Construction 5.5.

2. $\mathcal{B}'_{r,n,m}$ is a hitting set for $n \times m$ matrices of rank $\le r$ over $\mathbb{F}$.

3. $|\mathcal{B}'_{r,n,m}| = (n+m-r)r$.

4. $\mathcal{B}'_{r,n,m}$ is linearly independent (as vectors in $\mathbb{K}^{nm}$).

5. $\mathcal{B}'_{r,n,m}$ can be computed in poly($m$) operations, where operations (including a successor
function in some enumeration of $\mathbb{K}$) over $\mathbb{K}$ are counted at unit cost.

Proof. $|\mathcal{B}'_{r,n,m}| = (n+m-r)r$: The size is equal to $\sum_{\ell=0}^{r-1} ((n+m-1) - 2\ell) = (n+m-1)r - 2\binom{r}{2} =
(n+m-r)r$.

$\mathcal{B}'_{r,n,m}$ can be computed in poly($m$) operations: The details are very similar to the proof that
$\mathcal{B}_{r,n,m}$ can be computed in poly($m$) operations, as seen in Theorem 5.6, so we omit the specifics.

$\mathcal{B}'_{r,n,m}$ is a hitting set: This follows from showing that $\mathcal{B}_{r,n,m} \subseteq \mathrm{Span}(\mathcal{B}'_{r,n,m})$, as this implies
that for a matrix $M$, $M \in \ker \mathcal{B}_{r,n,m} \iff M \in \ker \mathcal{B}'_{r,n,m}$. Thus, as $\mathcal{B}_{r,n,m}$ is a hitting set so is
$\mathcal{B}'_{r,n,m}$.

$\mathrm{Span}\,\mathcal{B}'_{r,n,m} \subseteq \mathrm{Span}\,\mathcal{B}_{r,n,m}$: This is clear as $\mathcal{B}'_{r,n,m} \subseteq \mathcal{B}_{r,n,m}$.

$\mathrm{Span}\,\mathcal{B}'_{r,n,m} \supseteq \mathrm{Span}\,\mathcal{B}_{r,n,m}$: We will actually show $\mathcal{D}_{r,n,m} \subseteq \mathrm{Span}\,\mathcal{B}'_{r,n,m}$, which by Theorem 5.8
is sufficient. Let $M$ be any matrix (even of rank $> r$). We will show that the inner-products
$\langle M, \mathcal{B}'_{r,n,m} \rangle$ determine the inner-products $\langle M, \mathcal{D}_{r,n,m} \rangle$. Then we show that this implies the claim.

Recall that the inner-product of $M$ with a matrix $D \in \mathcal{D}_{r,n,m}$ is simply a coefficient $\mathfrak{C}_{x^k}(f_M(x, g^\ell x))$ for
some $0 \le k \le n+m-2$ and $0 \le \ell < r$. So to prove the claim we will speak of these coefficients
determining other such coefficients.

Now observe that for any $k \in \{0, \ldots, r-1\}$, the coefficients $\mathfrak{C}_{x^k}(f_M(x,x)), \mathfrak{C}_{x^k}(f_M(x,gx)),
\ldots, \mathfrak{C}_{x^k}(f_M(x,g^{r-1}x))$ are linear combinations of the $k+1 \le r$ elements in $\{M_{i,j}\}_{i+j=k}$. Just as
in the analysis of $\mathcal{D}'_{r,n,m}$ in Theorem 5.8, the first $k+1$ of these linear combinations are rows
of a Vandermonde matrix over distinct numbers, and thus these linear combinations span all
vectors. Thus, it follows that the coefficients $\{\mathfrak{C}_{x^k} f_M(x, g^\ell x)\}_{0 \le \ell < k+1}$ determine the coefficients
$\{\mathfrak{C}_{x^k} f_M(x, g^\ell x)\}_{0 \le \ell < r}$.

Similarly, for any $k \in \{(n+m)-(r+1), \ldots, (n+m)-2\}$ the coefficients
$\{\mathfrak{C}_{x^k} f_M(x, g^\ell x)\}_{0 \le \ell < (n+m)-(k+1)}$ determine the coefficients $\{\mathfrak{C}_{x^k} f_M(x, g^\ell x)\}_{0 \le \ell < r}$. We now use these
facts in the following claim.

Claim 5.11. The coefficients of $f_M(x, g^{k+1}x)$ are determined by the coefficients of
$f_M(x,x), f_M(x,gx), \ldots, f_M(x, g^k x)$ and the evaluations of $f_M(x, g^{k+1}x)$ at any $(n+m-1) - 2(k+1)$
distinct points.

Proof. By the above reasoning, the coefficients $\mathfrak{C}_{x^{k'}}(f_M(x, g^{k+1}x))$ with $k' \in \{0, \ldots, k\} \cup \{(n+m-2)-k,
\ldots, (n+m)-2\}$ are already determined by the coefficients given.
Now, consider the polynomial

\[ h(x) = \frac{ f_M(x, g^{k+1}x) - \sum_{0 \le k' \le k} \mathfrak{C}_{x^{k'}}(f_M(x, g^{k+1}x))\, x^{k'} - \sum_{(n+m-2)-k \le k' \le n+m-2} \mathfrak{C}_{x^{k'}}(f_M(x, g^{k+1}x))\, x^{k'} }{ x^{k+1} }. \]

By construction, $h$ is of degree $\le (n+m-2) - 2(k+1)$, and evaluation of $h$ is possible given oracle
access to $f_M(x, g^{k+1}x)$ as the relevant coefficients referenced are already determined.

Thus, it follows that $h$ is determined by interpolation at any $(n+m-1) - 2(k+1)$ distinct
points. Once $h$ is determined, the above equation determines the as yet undetermined coefficients
of $f_M(x, g^{k+1}x)$. □

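Thus, to determine all of the coefficients of the polynomials $\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$ we first interpolate
$f_M(x,x)$ at $n+m-1$ distinct points. The above claim then shows how to interpolate $f_M(x,gx)$
using $(n+m-1)-2$ evaluations of $f_M(x,gx)$, given access to the coefficients of $f_M(x,x)$. Inducting
on the above claim shows we can interpolate all of the coefficients in $\{f_M(x, g^\ell x)\}_{0 \le \ell < r}$ from the
evaluations $\{f_M(\alpha_k, g^\ell \alpha_k)\}_{0 \le \ell < r,\ 0 \le k \le (n+m-2)-2\ell}$. Rephrasing this, we see that the inner-products
$\langle M, \mathcal{D}_{r,n,m} \rangle$ are determined by the inner-products $\langle M, \mathcal{B}'_{r,n,m} \rangle$.

Now consider a matrix $B \notin \mathrm{Span}\,\mathcal{B}'_{r,n,m}$. It follows that the dual space of $\mathcal{B}'_{r,n,m}$ is strictly larger
than the dual space of $\mathcal{B}'_{r,n,m} \cup \{B\}$, so that there is a non-zero matrix $M_0$ such that $\langle M_0, \mathcal{B}'_{r,n,m} \rangle = 0$
but $\langle M_0, B \rangle \ne 0$. But as $\langle 0_{n \times m}, \mathcal{B}'_{r,n,m} \rangle = 0$ and $\langle 0_{n \times m}, B \rangle = 0$, it follows that the inner-product
$\langle M_0, \mathcal{B}'_{r,n,m} \rangle$ does not determine the inner-product $\langle M_0, B \rangle$. As $\langle M, \mathcal{B}'_{r,n,m} \rangle$ determines $\langle M, \mathcal{D}_{r,n,m} \rangle$,
it must be that $\mathcal{D}_{r,n,m} \subseteq \mathrm{Span}\,\mathcal{B}'_{r,n,m}$.

$\mathcal{B}'_{r,n,m}$ is linearly independent: As $\mathrm{Span}(\mathcal{B}'_{r,n,m}) = \mathrm{Span}(\mathcal{D}'_{r,n,m})$, $|\mathcal{B}'_{r,n,m}| = |\mathcal{D}'_{r,n,m}|$, and $\mathcal{D}'_{r,n,m}$
is linearly independent, it follows that $\mathcal{B}'_{r,n,m}$ is also. □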

Thus, we achieve an explicit hitting set of size $(n+m-r)r$. For $r = n$ we see that this
equals $nm$, matching the naive bound. For $r \le n-1$, $(n+m-r)r$ is increasing with $r$, so
$(n+m-r)r \le (n+m-(n-1))(n-1) = (m+1)(n-1) = nm + n - m - 1 < nm$. Thus, we see
that our hitting set is always smaller than the naive hitting set, for $r < n$.
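
A small numerical illustration of the pruned hitting set follows, using the evaluation pattern $0 \le k \le (n+m-2)-2\ell$ that appears in the proof above. It is only a sketch: the field $\mathrm{GF}(101)$ and the helper names are assumptions, and the $\alpha_k$ are taken non-zero here because the interpolation of $h$ in Claim 5.11 divides by $x^{k+1}$.

```python
P = 101  # small prime; K = GF(P) is an assumption of this sketch


def mult_order(g, p):
    """Multiplicative order of g modulo the prime p."""
    k, x = 1, g % p
    while x != 1:
        x = x * g % p
        k += 1
    return k


def rank_mod_p(rows, p):
    """Rank over GF(p) of a list of equal-length vectors, by Gaussian elimination."""
    rows = [list(v) for v in rows]
    rank = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rank, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][c], p - 2, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][c] % p:
                f = rows[i][c]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def rank1(a, b, n, m, p):
    """Flattened rank-1 matrix with (i, j) entry a^i * b^j."""
    return [pow(a, i, p) * pow(b, j, p) % p for i in range(n) for j in range(m)]


n, m, r = 3, 5, 2
g = next(a for a in range(2, P) if mult_order(a, P) >= m)
alphas = list(range(1, n + m))  # n+m-1 distinct, non-zero field elements

# Full set: every polynomial f_M(x, g^l x) is evaluated at all n+m-1 points.
B_full = [rank1(a, pow(g, l, P) * a % P, n, m, P) for l in range(r) for a in alphas]
# Pruned set: the l-th polynomial is evaluated at only (n+m-1) - 2l points.
B_pruned = [rank1(alphas[k], pow(g, l, P) * alphas[k] % P, n, m, P)
            for l in range(r) for k in range((n + m - 1) - 2 * l)]

pruned_size = len(B_pruned)
naive_size = n * m
```

Here the pruned set has $12$ members against $14$ for the full set and $15$ for the naive entry-by-entry set, and both families have rank $12$.
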



6 Identity Testing for Tensors 

In this section we show how to construct hitting sets for $[\![n]\!]^d$ tensors of arbitrary degree $d$. We will
only discuss tensors of shape $[\![n]\!]^d$ for simplicity. The proof technique will be to use the results for
$d = 2$ as a black-box as a way to induct on $d$. That is, Corollary 5.4 shows that one can test identity
of degree $< n$, rank $\le r$ bivariate polynomials by testing the identity of $r$ univariate polynomials,
each of degree $< 2n$. This effectively reduces the $d = 2$ case to the $d = 1$ case, while increasing the
number of polynomials to test by a factor of $r$. As degree $< 2n$ univariate polynomials can be fully
interpolated cheaply, this shows that this is a viable base case for recursion.

Intuitively, it seems like this variable reduction process should be able to be continued so that
a rank $\le r$ $d$-variate polynomial can be identity tested by testing identity of $\approx r^d$ univariate
polynomials each of degree $\approx dn$. This is indeed possible. However, we are able to do better here
by using a reduction process that reduces a $d$-variate polynomial to a $d/2$-variate polynomial while
only increasing the number of polynomials to test by a factor of $r$. Thus, a $d$-variate polynomial
can be identity tested by testing $\approx r^{\lg d}$ univariate polynomials, each of degree $< dn$. Unfortunately,
this set of polynomials will require $\approx (dn)^d$ time to construct.

The section will be split into two parts. The first will state the variable reduction theorem that 
was mentioned above. The second part will detail the hitting set arising from this theorem. 

6.1 Variable Reduction 

As with the $d = 2$ case, we will need a variable reduction result in order to construct our hitting set.
We detail this result in this subsection. We first present some lemmas about variable reduction.

Lemma 6.1. Let $f(x_1, \ldots, x_d)$ be a $d$-variate polynomial. Let $\sigma : [d] \to [d]$ be a permutation. Then,
$f(x_1, \ldots, x_d) = 0$ iff $f(x_{\sigma(1)}, \ldots, x_{\sigma(d)}) = 0$.

Proof. Consider the map $\mathbb{N}^d \to \mathbb{N}^d$ defined by $(i_1, \ldots, i_d) \mapsto (i_{\sigma(1)}, \ldots, i_{\sigma(d)})$. This is exactly the
action on the degrees of monomials over the variables $x_1, \ldots, x_d$ when performing the substitution
$x_i \mapsto x_{\sigma(i)}$. Note that this map is bijective.

Thus, when mapping $f(x_1, \ldots, x_d)$ to $f(x_{\sigma(1)}, \ldots, x_{\sigma(d)})$ we see that there can be no
cancellations, as distinct monomials are mapped to distinct monomials. Thus, the two polynomials have
the same number of non-zero coefficients. In particular, they are either both zero or non-zero. □

The above lemma is most useful in conjunction with the next lemma, which shows a simple 
d-variate to (d — l)-variate reduction. 

Lemma 6.2. Let $f(x, y, z_1, \ldots, z_d)$ be a $(d+2)$-variate polynomial such that $\deg_x(f) < n$. Then
for any $m \ge n$, $f(x, y, z_1, \ldots, z_d) = 0$ iff $f(x, x^m, z_1, \ldots, z_d) = 0$.

Proof. Consider the map $\mathbb{N}^{d+2} \to \mathbb{N}^{d+1}$ defined by $(i_1, i_2, i_3, \ldots, i_{d+2}) \mapsto (i_1 + m i_2, i_3, \ldots, i_{d+2})$.
This is exactly the action on the degrees of monomials over the variables $x, y, z_1, \ldots, z_d$ when
performing the substitution $y \mapsto x^m$.

Notice that this map is injective when restricted to $[\![n]\!] \times \mathbb{N}^{d+1}$, as $n \le m$. That is, if $i + mj =
i' + mj'$ with $(i,j), (i',j') \in [\![n]\!] \times \mathbb{Z}$ then $i \equiv i' \bmod m$, which means $i = i'$, and thus $j = j'$ as well.

Thus, when mapping $f(x, y, z_1, \ldots, z_d)$ to $f(x, x^m, z_1, \ldots, z_d)$ we see that there can be no
cancellations, as distinct monomials are mapped to distinct monomials. Thus, the two polynomials have
the same number of non-zero coefficients. In particular, they are either both zero or non-zero. □
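
The substitution $y \mapsto x^m$ of Lemma 6.2 can be illustrated in the bivariate case with a few lines of code. This is a sketch only: the dictionary representation of a polynomial and the function name `substitute_y_by_x_power` are choices made for this example. It also shows that the hypothesis $m \ge n > \deg_x(f)$ is needed, since a smaller $m$ can make distinct monomials collide and cancel.

```python
def substitute_y_by_x_power(poly, m):
    """Map f(x, y) to f(x, x^m).

    poly is a dict {(i, j): coeff} representing sum coeff * x^i * y^j;
    the result is a dict {e: coeff} representing sum coeff * x^e,
    with zero coefficients dropped.
    """
    out = {}
    for (i, j), c in poly.items():
        e = i + m * j  # the monomial x^i y^j becomes x^(i + m*j)
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


# f has deg_x(f) = 2 < n = 3, so any m >= 3 keeps all monomials distinct.
f = {(0, 1): 1, (2, 0): -1, (1, 2): 5}   # y - x^2 + 5 x y^2
image_ok = substitute_y_by_x_power(f, 3)  # x^3 - x^2 + 5 x^7

# With m = 2 < n the map on exponents is no longer injective:
# y - x^2 becomes x^2 - x^2 = 0 although the polynomial is non-zero.
f2 = {(0, 1): 1, (2, 0): -1}
image_bad = substitute_y_by_x_power(f2, 2)
```
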

The above lemmas show that we can "reshape" our polynomials, in the sense that we have fewer
variables but larger individual degrees. To perform our $d$-variate variable reduction, we will reshape
our polynomial into a bivariate polynomial, such that the individual degrees are now $\approx n^{d/2}$. We
can then apply our bivariate variable reduction to get a univariate polynomial of degree $\approx n^{d/2}$.
One can then reverse the reshaping, to yield a $d/2$-variate polynomial, with individual degrees $\approx n$.
One then recurses appropriately.

In order to understand the recursion pattern sketched above, we will introduce the following 
function. 
Definition 6.3. Let $n \ge 1$, $b \ge 0$. Let $0 \le k < 2^d$. Define

\[ L_{n,b}(k, i_1, \ldots, i_d) = \sum_{\substack{1 \le j \le d \\ \lfloor k/2^{j-1} \rfloor \equiv 1 \bmod 2}} i_j\, (n2^b)^{\lfloor k/2^j \rfloor}. \]

We now observe that it obeys the following properties. 
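Proposition 6.4. Let $n \ge 1$, $b \ge 0$, with $0 \le k < 2^d$. Then

1.
\[ L_{n,b}(k, i_1, \ldots, i_d) = \begin{cases} 0 & \text{if } k = 0 \\ i_1 (n2^b)^{\lfloor k/2 \rfloor} + L_{n,b}(\lfloor k/2 \rfloor, i_2, \ldots, i_d) & \text{if } k \equiv 1 \bmod 2 \\ L_{n,b}(\lfloor k/2 \rfloor, i_2, \ldots, i_d) & \text{else} \end{cases} \]

2. For $b \ge 1$, $L_{2n,b-1}(k, i_1, \ldots, i_d) = L_{n,b}(k, i_1, \ldots, i_d)$.

3. $L_{n,b}(k, i_1, \ldots, i_d) \le (n2^b)^{\lfloor k/2 \rfloor} \sum_{j \in [d]} i_j$.

4. $L_{n,b}(k, i_1, \ldots, i_d)$ can be computed in time poly($|n|, b, d, k, |i_1|, \ldots, |i_d|$), where $|\cdot|$ is the length,
in bits, of a number.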

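Proof. (1): We first note that $\lfloor \lfloor k/2^j \rfloor / 2^{j'} \rfloor = \lfloor k/2^{j+j'} \rfloor$, which is most easily seen by observing that
these operations bit truncate (on the right) the binary representation of $k$. If $k = 0$ then in both
formulas $L_{n,b}(k, i_1, \ldots, i_d) = 0$. If $k \equiv 1 \bmod 2$, then

\begin{align*}
L_{n,b}(k, i_1, \ldots, i_d)
  &= i_1 (n2^b)^{\lfloor k/2^1 \rfloor} + \sum_{\substack{2 \le j \le d \\ \lfloor k/2^{j-1} \rfloor \equiv 1 \bmod 2}} i_j (n2^b)^{\lfloor k/2^j \rfloor} \\
  &= i_1 (n2^b)^{\lfloor k/2 \rfloor} + \sum_{\substack{1 \le j \le d-1 \\ \lfloor k/2^{j} \rfloor \equiv 1 \bmod 2}} i_{j+1} (n2^b)^{\lfloor k/2^{j+1} \rfloor} \\
  &= i_1 (n2^b)^{\lfloor k/2 \rfloor} + \sum_{\substack{1 \le j \le d-1 \\ \lfloor \lfloor k/2 \rfloor / 2^{j-1} \rfloor \equiv 1 \bmod 2}} i_{j+1} (n2^b)^{\lfloor \lfloor k/2 \rfloor / 2^j \rfloor} \\
  &= i_1 (n2^b)^{\lfloor k/2 \rfloor} + L_{n,b}(\lfloor k/2 \rfloor, i_2, \ldots, i_d),
\end{align*}

which is exactly the above recursion. The case $k \equiv 0 \bmod 2$ is analogous.

(2): The definition of $L_{n,b}$ only depends on $n2^b$. Thus, as $2n \cdot 2^{b-1} = n \cdot 2^b$, this is immediate.

(3): This is immediate.

(4): The natural way of computing the formula $L_{n,b}(k, i_1, \ldots, i_d)$ is done in the given time
bound. □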
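
The function $L_{n,b}$ and the first three properties of Proposition 6.4 are easy to check by brute force on small parameters. The sketch below reads the summation condition $\lfloor k/2^{j-1} \rfloor \equiv 1 \bmod 2$ as "bit $j-1$ of $k$ is set"; the names `L` and `L_recursive` are chosen for this illustration.

```python
import itertools


def L(n, b, k, idx):
    """L_{n,b}(k, i_1..i_d): sum of i_j * (n*2^b)^floor(k/2^j) over j with bit j-1 of k set."""
    base = n * 2 ** b
    return sum(i_j * base ** (k >> j)
               for j, i_j in enumerate(idx, start=1)
               if (k >> (j - 1)) & 1)


def L_recursive(n, b, k, idx):
    """The same quantity computed via the recursion of Proposition 6.4.(1)."""
    if k == 0:
        return 0
    head = idx[0] * (n * 2 ** b) ** (k // 2) if k % 2 == 1 else 0
    return head + L_recursive(n, b, k // 2, idx[1:])


checks = []
for n, b, d in itertools.product((1, 2, 3), (1, 2), (1, 2, 3)):
    for k in range(2 ** d):
        for idx in itertools.product(range(3), repeat=d):
            value = L(n, b, k, idx)
            checks.append(value == L_recursive(n, b, k, idx))            # property 1
            checks.append(value == L(2 * n, b - 1, k, idx))              # property 2 (b >= 1)
            checks.append(value <= (n * 2 ** b) ** (k // 2) * sum(idx))  # property 3
```

For instance, with $n = 2$, $b = 1$ (so $n2^b = 4$), $k = 3$ and $(i_1, i_2) = (1, 1)$, both bits of $k$ are set and the value is $1 \cdot 4^{1} + 1 \cdot 4^{0} = 5$.
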

We will now prove our multi-variate variable reduction theorem. We prove here the case when 
the number of variables is a power of 2, for simplicity. The general case, with some loss, will follow 
as a corollary. The following notation will make the presentation simpler. 

Notation 6.5. Let $f((h_1(i), \ldots, h_k(i))_{i=1}^{r})$ denote

$f(h_1(1), \ldots, h_k(1), h_1(2), \ldots, h_k(2), \ldots, h_1(r), \ldots, h_k(r))$.

We will use this notation heavily in the following proof.

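Theorem 6.6. Let $n \ge 1$, $d \ge 1$ and $b \ge d-1$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$
has order $\ge (n2^b)^{2^{d-1}}$. Let $T : [\![n]\!]^{2^d} \to \mathbb{F}$ be a tensor of rank $\le r$. Let
$f_T(x_0, \ldots, x_{2^d-1}) = \sum_{\ell=1}^{r} \prod_{j=0}^{2^d-1} p_{\ell,j}(x_j)$, where $\deg p_{\ell,j} < n$.

Then $f_T$ is non-zero (over $\mathbb{F}$) iff one of the univariate polynomials in the set

\[ \left\{ f_T\!\left( \left( g^{L_{n,b}(j, \ell_1, \ldots, \ell_d)} x \right)_{j=0}^{2^d-1} \right) \right\}_{0 \le \ell_1, \ldots, \ell_d < r} \]

is non-zero (over $\mathbb{K}$).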

Proof. The proof will be by induction. For simplicity we write $f$ for $f_T$.

$d = 1$: Note that $L_{n,b}(0, \ell_1) = 0$ and $L_{n,b}(1, \ell_1) = \ell_1$, so this case follows from Corollary 5.4.

$d > 1$: We will first reshape $f$ into a bivariate polynomial, and appeal to the $d = 1$ case. We
will then un-reshape this polynomial into a $2^{d-1}$-variate polynomial, and then appeal to induction.
By induction on Lemma 6.2 (and appealing to Lemma 6.1 to see that Lemma 6.2 applies to
any two variables, not just the first) we see that

\[ f = 0 \quad\text{iff}\quad f\!\left( \left( x_0^{(n2^b)^j},\, x_1^{(n2^b)^j} \right)_{j=0}^{2^{d-1}-1} \right) = 0 \tag{1} \]

(where so far we only need that $b \ge 1$).

We split the rest of the proof into two claims. The first claim shows how we can, using the
bivariate case, test identity of the right-hand-side of Equation (1) by testing identity of a set of
$r$ polynomials, each of $2^{d-1}$ variables. The second claim shows how testing identity of these new
polynomials can be reduced to testing identity of univariate polynomials, where we use the induction
hypothesis.



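Claim 6.7.

\[ f\!\left( \left( x_0^{(n2^b)^j},\, x_1^{(n2^b)^j} \right)_{j=0}^{2^{d-1}-1} \right) = 0 \]

iff

\[ \left\{ f\!\left( \left( x_j,\, g^{\ell_1 (n2^b)^j} x_j \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_1 < r} = 0. \]

Proof. First observe that

\begin{align*}
f'(x_0, x_1) &:= f\!\left( \left( x_0^{(n2^b)^j},\, x_1^{(n2^b)^j} \right)_{j=0}^{2^{d-1}-1} \right) \\
  &= f\!\left( x_0, x_1, x_0^{n2^b}, x_1^{n2^b}, x_0^{(n2^b)^2}, x_1^{(n2^b)^2}, \ldots \right) \\
  &= \sum_{\ell=1}^{r} \left( \prod_{j=0}^{2^{d-1}-1} p_{\ell,2j}\!\left( x_0^{(n2^b)^j} \right) \right) \left( \prod_{j=0}^{2^{d-1}-1} p_{\ell,2j+1}\!\left( x_1^{(n2^b)^j} \right) \right)
\end{align*}

so we can apply Corollary 5.4 to see that $f'(x_0, x_1) = 0$ iff $\{f'(x_0, g^{\ell_1} x_0)\}_{0 \le \ell_1 < r} = 0$, which, when
expanded, is equivalent to

\[ \left\{ f\!\left( \left( x_0^{(n2^b)^j},\, (g^{\ell_1} x_0)^{(n2^b)^j} \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_1 < r} = 0 \]

using that the order of $g$ is $\ge (n2^b)^{2^{d-1}} \ge \deg_{x_0} f', \deg_{x_1} f'$. Using that $2^b \ge 2^{d-1} \ge 2$, we can
undo the variable substitutions $x_j \leftarrow x_0^{(n2^b)^j}$. That is, applying Lemma 6.2 in reverse, we see that
the above set of polynomials is zero iff

\[ \left\{ f\!\left( \left( x_j,\, g^{\ell_1 (n2^b)^j} x_j \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_1 < r} = 0 \]

which is exactly the claim. □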
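Claim 6.8.

\[ \left\{ f\!\left( \left( x_j,\, g^{\ell_1 (n2^b)^j} x_j \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_1 < r} = 0 \]

iff

\[ \left\{ f\!\left( \left( g^{L_{n,b}(j, \ell_1, \ldots, \ell_d)} x \right)_{j=0}^{2^d-1} \right) \right\}_{0 \le \ell_1, \ldots, \ell_d < r} = 0. \]

Proof. Fix $\ell_1$ and write $f'(x_0, \ldots, x_{2^{d-1}-1}) := f\big( ( x_j,\, g^{\ell_1 (n2^b)^j} x_j )_{j=0}^{2^{d-1}-1} \big)$. First observe that

\[ f'(x_0, \ldots, x_{2^{d-1}-1}) = \sum_{\ell=1}^{r} \prod_{j=0}^{2^{d-1}-1} p_{\ell,2j}(x_j)\, p_{\ell,2j+1}\!\left( g^{\ell_1 (n2^b)^j} x_j \right) \]

so $f'$ is $2^{d-1}$-variate, having individual degrees $< 2n$. Thus, applying induction to the theorem for
the $2^{d-1}$-variate case (and using $b-1$ instead of $b$, noticing that $b-1 \ge (d-1)-1$ also holds), we
get that $f'(x_0, x_1, \ldots, x_{2^{d-1}-1}) = 0$ iff

\[ \left\{ f'\!\left( \left( g^{L_{2n,b-1}(j, \ell_2, \ldots, \ell_d)} x \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_2, \ldots, \ell_d < r} = 0 \]

or in terms of $f$,

\[ \left\{ f\!\left( \left( g^{L_{2n,b-1}(j, \ell_2, \ldots, \ell_d)} x,\ g^{\ell_1 (n2^b)^j + L_{2n,b-1}(j, \ell_2, \ldots, \ell_d)} x \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_2, \ldots, \ell_d < r} = 0 \]

where we have used that the order of $g$ is $\ge (n2^b)^{2^{d-1}} \ge (2n \cdot 2^{b-1})^{2^{(d-1)-1}}$. Invoking Proposition 6.4.(2)
and Proposition 6.4.(1) we see that the above polynomials being zero is equivalent to

\[ \left\{ f\!\left( \left( g^{L_{n,b}(2j, \ell_1, \ldots, \ell_d)} x,\ g^{L_{n,b}(2j+1, \ell_1, \ldots, \ell_d)} x \right)_{j=0}^{2^{d-1}-1} \right) \right\}_{0 \le \ell_2, \ldots, \ell_d < r} = 0 \]

and reindexing (and ranging over $0 \le \ell_1 < r$), this is equivalent to

\[ \left\{ f\!\left( \left( g^{L_{n,b}(j, \ell_1, \ldots, \ell_d)} x \right)_{j=0}^{2^d-1} \right) \right\}_{0 \le \ell_1, \ldots, \ell_d < r} = 0 \]

which is the claim. □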

Chaining together Equation 1 and the above two claims, yields the theorem. □ 

Remark 6.9. Let $D = 2^d$. In the above proof we use a recursion scheme that reduces to the problem
when $D \leftarrow 2$ and $D \leftarrow D/2$. This gives rise to the recursion $T(D) \le T(2) + T(D/2)$, where $T(D)$
is the minimum number such that a $D$-variate rank $\le r$ polynomial can be identity tested using
$r^{T(D)}$ univariate polynomials. There is also the recursion $S(D) \le r(Dn)^{D/2} + S(D/2)$, where $S(D)$
is the maximum degree of $g$ seen in this reduction to the univariate case.

One can do slightly better than this scheme by using the "square root trick", where we break
up the $D$-variate case into two copies of the $\sqrt{D}$-variate case. This yields the recursions $T(D) \le
2T(\sqrt{D})$ and $S(D) \le r(Dn)^{\sqrt{D}} \cdot S(\sqrt{D}) + S(\sqrt{D})$. This yields the same solution to $T$, but now has
that $S(D) = O(r(Dn)^{O(\sqrt{D})})$ instead of $r(Dn)^{D/2}$. While this is an improvement, it is somewhat
mild.

Similarly, one can give other recursion schemes that minimize $S$ (so it is poly($n, D, r$)), but at
the cost of making $T(D) \approx D$.
6.2 The Hitting Set for Tensors

In this subsection we use the variable reduction theorem of the last subsection to construct hitting
sets for tensors. First, recall our notion of a hitting set for tensors from Section 3, as well as the
definitions of the polynomials associated with $T$, in particular $f_T$. As $\mathfrak{C}_{x_1^{i_1} \cdots x_d^{i_d}}(f_T) = T(i_1, \ldots, i_d)$ we see
that $T = 0$ iff $f_T = 0$. Theorem 6.6 shows that $f_T = 0$ iff a set of univariate polynomials are all
zero. Thus, to test if $T$ is zero we can interpolate each of these polynomials. As these polynomials
are defined via $f_T$, these interpolations can be realized as inner-products with $T$. This will yield
our hitting set, which we now make formal.

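Construction 6.10. Let $n, r \ge 1$ and $d \ge 2$. Let $\mathbb{K}$ be an extension of $\mathbb{F}$ such that $g \in \mathbb{K}$ is of
order $\ge (2dn)^d$ and $\alpha_1, \ldots, \alpha_{dn} \in \mathbb{K}$ are distinct. Let $B_{k,\ell_1,\ldots,\ell_{\lceil \lg d \rceil}} : [\![n]\!]^d \to \mathbb{K}$ be the rank-1
tensor defined by

\[ B_{k,\ell_1,\ldots,\ell_{\lceil \lg d \rceil}}(i_1, \ldots, i_d) = \prod_{j=1}^{d} \left( g^{L_{n,\lceil \lg d \rceil}(j-1,\, \ell_1, \ldots, \ell_{\lceil \lg d \rceil})}\, \alpha_k \right)^{i_j} \]

and let $\mathcal{B}_{d,n,r} := \{B_{k,\ell_1,\ldots,\ell_{\lceil \lg d \rceil}}\}_{0 \le \ell_1, \ldots, \ell_{\lceil \lg d \rceil} < r,\ 1 \le k \le dn}$.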

We now give the analysis for this hitting set. 

Theorem 6.11. Let $n, r \ge 1$ and $d \ge 2$. Then $\mathcal{B}_{d,n,r}$, as defined in Construction 6.10, has the
following properties:

1. $\mathcal{B}_{d,n,r}$ is a hitting set for $[\![n]\!]^d$ tensors of rank $\le r$ over $\mathbb{F}$.

2. $|\mathcal{B}_{d,n,r}| = dnr^{\lceil \lg d \rceil}$.

3. $\mathcal{B}_{d,n,r}$ can be computed in poly($(2dn)^d, r^{\lceil \lg d \rceil}$) operations, where operations (including a
successor function in some enumeration of $\mathbb{K}$) over $\mathbb{K}$ are counted at unit cost.
Proof. $|\mathcal{B}_{d,n,r}| = dnr^{\lceil \lg d \rceil}$: This is by definition.

$\mathcal{B}_{d,n,r}$ can be computed in poly($(2dn)^d, r^{\lceil \lg d \rceil}$) operations: We assume here an enumeration of
elements in $\mathbb{K}$ such that the successor in this enumeration can be computed at unit cost. We also
will assume testing whether an element is zero, as well as arithmetic operations in the field, are done at unit cost.

First observe that there are at most $(2dn)^d$ solutions to $x^{(2dn)^d} - 1$ over $\mathbb{K}$, so if we enumerate
$(2dn)^d + 1$ elements of $\mathbb{K}$, then we can find a $g \in \mathbb{K}$ with order $\ge (2dn)^d$. This is in poly($(2dn)^d$)
operations. Similarly, the enumeration will give us $dn$ distinct elements which yield the desired $\alpha_k$.

By Proposition 6.4, $L_{n,\lceil \lg d \rceil}(j, \ell_1, \ldots, \ell_{\lceil \lg d \rceil})$ can be computed in poly($d, n, r$) steps, and this
number is $\le (2dn)^d$, so computing $g^{L_{n,\lceil \lg d \rceil}(j, \ell_1, \ldots, \ell_{\lceil \lg d \rceil})}$ will take at most poly($(2dn)^d, r$) operations.
Computing the powers of $\alpha_k$ will take poly($d, r$) time. Thus, each $B_{k,\ell_1,\ldots,\ell_{\lceil \lg d \rceil}}$ can be computed in
poly($(2dn)^d, r^{\lceil \lg d \rceil}$) steps. As there are poly($dnr^{\lceil \lg d \rceil}$) of them, all of $\mathcal{B}_{d,n,r}$ can be computed in
poly($(2dn)^d, r^{\lceil \lg d \rceil}$) operations.

$\mathcal{B}_{d,n,r}$ is a hitting set: By construction $\mathcal{B}_{d,n,r}$ is a set of rank-1 tensors, so it remains to show that
it hits each low-rank tensor. Consider any $T : [\![n]\!]^d \to \mathbb{F}$ of rank $\le r$. We now apply Theorem 6.6 to
$\hat{f}_T$, where we consider $\hat{f}_T$ as a $2^{\lceil \lg d\rceil}$-variate polynomial of rank $\le r$ (by padding $\hat{f}_T$ with dummy
variables), individual degrees $< n$, and taking $b = \lceil \lg d\rceil$. This shows that $\hat{f}_T = 0$ iff a set of
univariate polynomials, indexed by $(\ell_1,\dots,\ell_{\lceil \lg d\rceil})$ with $0 \le \ell_1,\dots,\ell_{\lceil \lg d\rceil} < r$, are all zero
(over $\mathbb{K}$). Each of the above univariate polynomials has degree $\le d(n-1)$, so interpolating them
at $dn \ge d(n-1)+1$ points will completely determine them. In particular, the above polynomials
are zero iff all the evaluations at any $dn$ distinct points are zero.

Now we observe, just as in the matrix case, that evaluating the $(\ell_1,\dots,\ell_{\lceil \lg d\rceil})$-th polynomial
in the above set at the point $\alpha_k$ is exactly the same as the inner product $\langle T, B_{k,\ell_1,\dots,\ell_{\lceil \lg d\rceil}}\rangle$. Thus,
$T = 0$ iff $\hat{f}_T = 0$ iff all of these inner-products are zero. This exactly means that $\mathcal{B}_{d,n,r}$ is a hitting
set. □

We remark that this hitting set is of quasi-polynomial size, as a rank $\le r$ tensor $T : [\![n]\!]^d \to \mathbb{F}$
can be represented using $dnr$ field elements. However, its construction time is exponential in $d$.
We leave it as an open question as to whether the construction time can be made to match (up to
polynomial factors) the size of the hitting set.

6.3 Identity Testing for Tensors over Small Fields 

Thus far we have assumed the existence of an element $g \in \mathbb{K}$ of large order. In doing so, all of
our hitting sets are tensors over the field $\mathbb{K}$ instead of the base field $\mathbb{F}$. While this is a common
assumption when the polynomials of interest are of high degree, the polynomials arising from $[\![n]\!]^d$
tensors on $dn$ variables are of degree $\le d$, so hitting sets still exist when $\mathbb{F}$ is $O(d)$ sized (as seen
in Lemma 3.13). In this section, we explore this question and show how to transform hitting sets
over $\mathbb{K}$ to hitting sets over $\mathbb{F}$, with some loss. Combining this with the above results, we construct
explicit hitting sets over any $\mathbb{F}$.

We first detail a field simulation result that produces improper hitting sets. 

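Proposition 6.12. Let $\mathbb{K}$ be an extension of $\mathbb{F}$, with $k = \dim_{\mathbb{F}} \mathbb{K}$. For $\ell \in [\![k]\!]$, let $\varphi_\ell : \mathbb{K} \to \mathbb{F}$
denote the $k$ projection maps to the standard basis coordinates of $\mathbb{K}$.

Let $\mathcal{H} \subseteq \mathbb{K}^{[\![n]\!]^d}$ be an improper hitting-set for $[\![n]\!]^d$ tensors of rank $\le r$. For $H \in \mathcal{H}$ define $H_\ell$ by

$$(H_\ell)_{i_1,\dots,i_d} := \varphi_\ell(H_{i_1,\dots,i_d})$$

and define

$$\mathcal{H}' := \{H_\ell\}_{H \in \mathcal{H},\ \ell \in [\![k]\!]}.$$

Then

1. If all tensors in $\mathcal{H}$ are $s$-sparse, then so are all tensors in $\mathcal{H}'$.

2. $|\mathcal{H}'| = k \cdot |\mathcal{H}|$.

3. $\mathcal{H}'$ is an improper hitting set for $[\![n]\!]^d$ tensors of rank $\le r$.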

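Proof. (1): If $H_{i_1,\dots,i_d} = 0$ then it follows that $(H_\ell)_{i_1,\dots,i_d} = 0$ for all $\ell$.

(2): This is by construction.

(3): Let $\alpha_0,\dots,\alpha_{k-1}$ be the standard basis for $\mathbb{K}$ as an $\mathbb{F}$-vector-space. As $T$ has entries in $\mathbb{F}$, it follows that

$$\langle T, H\rangle = \sum_{\ell \in [\![k]\!]} \langle T, H_\ell\rangle\, \alpha_\ell.$$

Consider some non-zero tensor $T : [\![n]\!]^d \to \mathbb{F}$ of rank $\le r$. Then we know that there is some $H \in \mathcal{H}$ such
that $\langle T, H\rangle \ne 0$. It follows that there must be some $\ell$ with $\langle T, H_\ell\rangle \ne 0$. □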

We now apply this to our hitting set results. 

Corollary 6.13. Let $m \ge n \ge r \ge 1$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}(m)$-explicit improper
hitting set for $n \times m$ matrices of rank $\le r$, of size $O(rm \lg m)$. Further, each matrix in the hitting
set is $O(n)$-sparse.






Proof. If $\mathbb{F}$ has an element of order $\ge m$, then Theorem 5.8 suffices.

If not, let $\mathbb{K}$ be an extension field of $\mathbb{F}$ such that $\dim_{\mathbb{F}} \mathbb{K} = \Theta(\lg m)$, and thus there is an element
of order $\ge m$ in $\mathbb{K}$. Such an extension can be explicitly described by an irreducible polynomial over
$\mathbb{F}$ of degree $\Theta(\lg m)$, which can be found in $\mathrm{poly}(m)$ time, in which time we can also find $g$. Using
Theorem 5.8 to get an $n$-sparse (improper) hitting-set over $\mathbb{K}$ for these $\mathbb{F}$-matrices, and applying
Proposition 6.12 yields the result. □

Corollary 6.14. Let $n, r \ge 1$, $d \ge 2$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}((2nd)^d, r^{\lceil \lg d\rceil})$-explicit
improper hitting set for $[\![n]\!]^d$-tensors of rank $\le r$, of size $O(dn\,r^{\lceil \lg d\rceil} \cdot (d \lg 2dn))$.

Proof. If $\mathbb{F}$ has an element of order $\ge (2nd)^d$, then Theorem 6.11 suffices.

If not, let $\mathbb{K}$ be an extension field of $\mathbb{F}$ such that $\dim_{\mathbb{F}} \mathbb{K} = \Theta(d \lg(2nd))$, and thus there is an
element of order $\ge (2nd)^d$ in $\mathbb{K}$. Such an extension can be explicitly described by an irreducible
polynomial over $\mathbb{F}$ of degree $\Theta(d \lg 2nd)$, which can be found in $\mathrm{poly}((2nd)^d)$ time, in which time we
can also find $g$. Using Theorem 6.11 to get a hitting-set over $\mathbb{K}$ for these $\mathbb{F}$-tensors, and applying
Proposition 6.12 yields the result. □

The above results only yield improper hitting sets. We now show how to preserve the rank-1
property of the original hitting set, and thus get proper hitting sets over small fields. To do so, we
first recall a standard fact in algebra showing that $\mathbb{K}$ is isomorphic to a subring of $\mathbb{F}$-matrices.

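Lemma 6.15. Let $\mathbb{K}$ be an extension of $\mathbb{F}$, and let $k = \dim_{\mathbb{F}} \mathbb{K} < \infty$ so that $\mathbb{K} \cong \mathbb{F}^k$ as vector
spaces. For any $\alpha \in \mathbb{K}$ define the linear map $\mu_\alpha : \mathbb{F}^k \to \mathbb{F}^k$ given by the multiplication map $x \mapsto \alpha x$.
Let $M_\alpha \in \mathbb{F}^{k \times k}$ be the associated matrix. Then the map $M_{(\cdot)} : \mathbb{K} \to \mathbb{F}^{k \times k}$ is an isomorphism (onto its image) of
$\mathbb{F}$-algebras.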

Proof. The map is clearly well-defined. To see the additive homomorphism, note that as $(\alpha + \beta)\gamma =
\alpha\gamma + \beta\gamma$ for any $\alpha, \beta, \gamma \in \mathbb{K}$, it follows that $M_{\alpha+\beta}\gamma = M_\alpha\gamma + M_\beta\gamma$ for any $\gamma \in \mathbb{F}^k \cong \mathbb{K}$ (where we
abuse notation by writing $\gamma$ to denote an element in $\mathbb{K}$ as well as its representation as a vector in
$\mathbb{F}^k$). Taking $\gamma$ over each vector in some basis shows that $M_{\alpha+\beta} = M_\alpha + M_\beta$.

Similarly, to see the multiplicative homomorphism note that for any $\alpha, \beta, \gamma \in \mathbb{K}$ we have that
$(\alpha\beta)\gamma = \alpha(\beta\gamma)$. Thus it must be that $M_\alpha M_\beta \gamma = \alpha\beta\gamma = M_{\alpha\beta}\gamma$. Again, taking $\gamma$ over each vector
in a basis determines a linear operator. Thus it must be that $M_\alpha M_\beta = M_{\alpha\beta}$.

Noting that for $a \in \mathbb{F}$ we have that $M_a = a I_k$, we then gain $\mathbb{F}$-linearity of the map.

If $\alpha \ne 0$ then $M_\alpha \cdot M_{\alpha^{-1}} = M_1 = I_k$, so $M_\alpha$ is invertible. Thus, if $M_\alpha = M_\beta$ then $M_{\alpha-\beta} = 0_k$,
which implies that $\alpha - \beta = 0$ (as else $M_{\alpha-\beta}$ would be invertible) and thus $\alpha = \beta$. This implies the
map is injective.

As a map is surjective onto its image by definition, this establishes the $\mathbb{F}$-algebra isomorphism onto the image.

□
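As a small sanity check of Lemma 6.15, the following sketch builds the multiplication matrices $M_\alpha$ for one concrete extension and verifies the homomorphism properties exhaustively. The choice $\mathbb{F} = \mathbb{F}_2$, $\mathbb{K} = \mathbb{F}_2[t]/(t^3 + t + 1)$, the basis $1, t, t^2$, and all function names are assumptions made for illustration; they do not come from the text above.

```python
from itertools import product

P = 2                     # base field F = F_2 (illustrative choice)
MODULUS = [1, 1, 0, 1]    # t^3 + t + 1, coefficients listed low degree first
K_DIM = len(MODULUS) - 1  # k = dim_F K = 3


def field_add(a, b):
    """Add two elements of K given as coefficient lists of length K_DIM."""
    return [(u + v) % P for u, v in zip(a, b)]


def field_mul(a, b):
    """Multiply two elements of K, reducing modulo MODULUS and modulo P."""
    prod = [0] * (2 * K_DIM - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    # Eliminate the high-degree terms, highest degree first (MODULUS is monic).
    for deg in range(len(prod) - 1, K_DIM - 1, -1):
        coef = prod[deg]
        for j in range(K_DIM + 1):
            idx = deg - K_DIM + j
            prod[idx] = (prod[idx] - coef * MODULUS[j]) % P
    return prod[:K_DIM]


def mult_matrix(a):
    """Matrix M_a of x -> a*x in the basis 1, t, ..., t^(k-1); column j is a * t^j."""
    cols = []
    for j in range(K_DIM):
        basis_vec = [0] * K_DIM
        basis_vec[j] = 1
        cols.append(field_mul(a, basis_vec))
    return [[cols[j][i] for j in range(K_DIM)] for i in range(K_DIM)]


def mat_mul(A, B):
    """Product of two k x k matrices over F_P."""
    return [[sum(A[i][l] * B[l][j] for l in range(K_DIM)) % P
             for j in range(K_DIM)] for i in range(K_DIM)]


def mat_add(A, B):
    """Sum of two k x k matrices over F_P."""
    return [[(A[i][j] + B[i][j]) % P for j in range(K_DIM)] for i in range(K_DIM)]


ALL_ELEMENTS = [list(e) for e in product(range(P), repeat=K_DIM)]
```

In this basis the first column of $M_\alpha$ is the coordinate vector of $\alpha$ itself, which is the fact used later to recover $\alpha$ from $M_\alpha e_0$.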

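We now show how to use this alternate representation of $\mathbb{K}$ as a way to simulate hitting sets
defined over $\mathbb{K}$ by hitting sets defined over $\mathbb{F}$.

Proposition 6.16. Let $\mathbb{K}$ be an extension of $\mathbb{F}$, with $k = \dim_{\mathbb{F}} \mathbb{K}$. Let $\mathcal{H} \subseteq \mathbb{K}^{[\![n]\!]^d}$ be a hitting-set
for $[\![n]\!]^d$ tensors of rank $\le r$. For $H = \otimes_{j=1}^{d} v_j \in \mathbb{K}^{[\![n]\!]^d}$ define $v_{j,\ell,\ell'} \in \mathbb{F}^{[\![n]\!]}$ by

$$(v_{j,\ell,\ell'})_i := \left(M_{(v_j)_i}\right)_{\ell,\ell'}$$

where $M_{(\cdot)} : \mathbb{K} \to \mathbb{F}^{k \times k}$ is the isomorphism of Lemma 6.15, and define

$$H_{\ell_0,\dots,\ell_d} := \otimes_{j=1}^{d} v_{j,\ell_{j-1},\ell_j}$$

and define

$$\mathcal{H}' := \{H_{\ell_0,\dots,\ell_{d-1},0}\}_{H \in \mathcal{H},\ 0 \le \ell_0,\dots,\ell_{d-1} < k}.$$

Then

1. $\mathcal{H}'$ is a set of rank-1 $\mathbb{F}$-tensors of shape $[\![n]\!]^d$.

2. $|\mathcal{H}'| = k^d \cdot |\mathcal{H}|$.

3. $\mathcal{H}'$ is a hitting set for $[\![n]\!]^d$ tensors of rank $\le r$.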

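Proof. (1): This is by construction.

(2): This is by construction.

(3): Consider some tensor $T : [\![n]\!]^d \to \mathbb{F}$ of rank $\le r$. Then we know that there is some $H \in \mathcal{H}$
with $H = \otimes_{j=1}^{d} v_j$, such that $\langle T, H\rangle \ne 0$. Then we see that (we now abuse notation, by writing $\mu$
to denote the map $M_{(\cdot)}$)

$$\mu(\langle T, H\rangle) = \mu\Big(\sum_{i_1,\dots,i_d \in [\![n]\!]} T(i_1,\dots,i_d) \prod_{j=1}^{d} (v_j)_{i_j}\Big)
= \sum_{i_1,\dots,i_d \in [\![n]\!]} T(i_1,\dots,i_d) \prod_{j=1}^{d} \mu\big((v_j)_{i_j}\big)$$

and by expanding the matrix multiplication of $d$ matrices, each $k \times k$,

$$\mu(\langle T, H\rangle)_{\ell_0,\ell_d}
= \sum_{i_1,\dots,i_d \in [\![n]\!]} T(i_1,\dots,i_d) \sum_{\ell_1,\dots,\ell_{d-1} \in [\![k]\!]} \prod_{j=1}^{d} \mu\big((v_j)_{i_j}\big)_{\ell_{j-1},\ell_j}$$

$$= \sum_{\ell_1,\dots,\ell_{d-1} \in [\![k]\!]} \sum_{i_1,\dots,i_d \in [\![n]\!]} T(i_1,\dots,i_d) \prod_{j=1}^{d} \mu\big((v_j)_{i_j}\big)_{\ell_{j-1},\ell_j}$$

$$= \sum_{\ell_1,\dots,\ell_{d-1} \in [\![k]\!]} \langle T, H_{\ell_0,\dots,\ell_d}\rangle.$$

So it follows that if $\mu(\langle T, H\rangle)_{\ell_0,\ell_d} \ne 0$ then there is some $\ell_1,\dots,\ell_{d-1} \in [\![k]\!]$ such that
$\langle T, H_{\ell_0,\dots,\ell_d}\rangle \ne 0$.

Let $\gamma_0$ denote the element in $\mathbb{K}$ corresponding to $e_0 \in \mathbb{F}^k$ (the standard basis vector with a 1 in
the zero position). Note that $\gamma_0 \ne 0$. Then it follows that for any $\alpha \in \mathbb{K}$ we have $M_\alpha e_0 = M_\alpha \gamma_0 = \alpha\gamma_0$
(where we abuse notation by writing $\alpha\gamma_0$ to denote an element in $\mathbb{K}$ as well as the vector representing
$\alpha\gamma_0$ in $\mathbb{F}^k$). Thus, $\alpha$ is fully recoverable from $M_\alpha e_0$, and in particular, $\alpha = 0$ iff $M_\alpha e_0 = 0$.

Thus, to test if $\langle T, H\rangle = 0$ (over $\mathbb{K}$) it is enough to test if $\mu(\langle T, H\rangle)_{\ell_0,0} = 0$ (over $\mathbb{F}$) for all
$\ell_0 \in [\![k]\!]$. Combining this with the above we see that $\langle T, \mathcal{H}\rangle = 0$ (over $\mathbb{K}$) iff $\langle T, \mathcal{H}'\rangle = 0$ (over $\mathbb{F}$). □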

We now use the above result to get hitting sets for matrices and tensors over any field.

Corollary 6.17. Let $m \ge n \ge r \ge 1$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}(m)$-explicit hitting set for
$n \times m$ matrices of rank $\le r$, of size $O(rm \lg^2 m)$.

Proof. If $\mathbb{F}$ has an element of order $\ge m$, then Theorem 5.6 suffices.

If not, let $\mathbb{K}$ be an extension field of $\mathbb{F}$ such that $\dim_{\mathbb{F}} \mathbb{K} = \Theta(\lg m)$, and thus there is an element
of order $\ge m$ in $\mathbb{K}$. Such an extension can be explicitly described by an irreducible polynomial over
$\mathbb{F}$ of degree $\Theta(\lg m)$, which can be found in $\mathrm{poly}(m)$ time, in which time we can also find $g$. Using
Theorem 5.6 to get a hitting-set over $\mathbb{K}$ for these $\mathbb{F}$-matrices, and applying Proposition 6.16 yields
the result. □

Corollary 6.18. Let $n, r \ge 1$, $d \ge 2$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}((2nd)^d, r^{\lceil \lg d\rceil})$-explicit
hitting set for $[\![n]\!]^d$-tensors of rank $\le r$, of size $O(dn\,r^{\lceil \lg d\rceil} (d \lg 2dn)^d)$.

Proof. If $\mathbb{F}$ has an element of order $\ge (2nd)^d$, then Theorem 6.11 suffices.

If not, let $\mathbb{K}$ be an extension field of $\mathbb{F}$ such that $\dim_{\mathbb{F}} \mathbb{K} = \Theta(d \lg(2nd))$, and thus there is an
element of order $\ge (2nd)^d$ in $\mathbb{K}$. Such an extension can be explicitly described by an irreducible
polynomial over $\mathbb{F}$ of degree $\Theta(d \lg 2nd)$, which can be found in $\mathrm{poly}((2nd)^d)$ time, in which time we
can also find $g$. Using Theorem 6.11 to get a hitting-set over $\mathbb{K}$ for these $\mathbb{F}$-tensors, and applying
Proposition 6.16 yields the result. □



7 Explicit Low Rank Recovery of Matrices 

Thus far we have discussed identity testing for matrices (and tensors). There the main concern is to 
(deterministically) determine whether the matrix is identically zero. However, we may also ask for 
more, in that we may want to (deterministically) reconstruct the entire matrix. Throughout this 
section we will only discuss deterministic measurements which are linear (so are inner products with 
the unknown matrix or vector), non-adaptive (so the measurements are independent of the unknown 
matrix or vector) and noiseless. The focus on deterministic measurements differs from prior work, 
which typically focuses on showing that certain distributions of measurements allow recovery with 
high probability. That the measurements are restricted to be linear is a common assumption in 
compressed sensing. Non-adaptiveness is also a common assumption, but it is important to note 
that recent work [IPW11] shows that adaptivity in (noisy) sparse-recovery can be more powerful
than non-adaptivity. Finally, we assume our matrices are exactly rank $\le r$, not just close to
some matrix that is rank $\le r$, and we assume that our measurements are noiseless. This is not
quite practical for compressed sensing, but some previous work also makes this assumption [GK72,
Gab85b, Gab85a, Del78, Rot91, Rot96, RFP10]. Further, the noiseless case is more natural for our
applications to rank-metric codes, and allows the results to be field independent.

We begin by noting that low-rank recovery (recall Definition 3.9, which we consider in this
section only for matrices) generalizes the notion of sparse-recovery, which is defined formally
as follows.

Definition 7.1. A set of vectors $\mathcal{V} \subseteq \mathbb{F}^n$ is an $s$-sparse-recovery set if for every vector $x \in \mathbb{F}^n$
with at most $s$ non-zero entries, $x$ is uniquely determined by $y$, where $y \in \mathbb{F}^{\mathcal{V}}$ is defined by

$$y_v := \langle x, v\rangle, \quad \text{for } v \in \mathcal{V}.$$

An algorithm performs recovery from $\mathcal{V}$ if, for each such $x$, it recovers that $x$ given $y$.

That low-rank recovery generalizes sparse-recovery is formalized in the following claim.

Lemma 7.2. Given an $r$-low-rank-recovery set $\mathcal{R}$ for $n \times n$ matrices, there is a set $\mathcal{V} \subseteq \mathbb{F}^n$,
efficiently constructible from $\mathcal{R}$, with $|\mathcal{V}| = |\mathcal{R}|$, such that $\mathcal{V}$ is an $r$-sparse-recovery set.

Proof. Given an $r$-sparse vector $x \in \mathbb{F}^n$ construct the diagonal matrix $A \in \mathbb{F}^{n \times n}$ with $x$ on its
diagonal. Thus, $A$ is rank $\le r$. Thus, if we can perform $r$-low-rank-recovery we can also do $r$-sparse
recovery. Each such measurement of $A$ can be seen to also be a linear measurement of $x$, so this
yields $\mathcal{V}$. □
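The reduction in Lemma 7.2 can be sketched in a few lines: a matrix measurement $\langle A, R\rangle$ of the diagonal matrix $A$ only sees the diagonal of $R$, so each matrix measurement induces the vector measurement given by that diagonal. The helper names below are hypothetical and plain integers stand in for field elements.

```python
def inner(A, R):
    """Entrywise inner product <A, R> of two equally sized matrices."""
    return sum(a * r for row_a, row_r in zip(A, R) for a, r in zip(row_a, row_r))


def diag_matrix(x):
    """The n x n diagonal matrix with the vector x on its diagonal."""
    n = len(x)
    return [[x[i] if i == j else 0 for j in range(n)] for i in range(n)]


def induced_vector_measurement(R):
    """The vector v with <x, v> = <diag(x), R> for every x: the diagonal of R."""
    return [R[i][i] for i in range(len(R))]


def dot(x, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(x, v))
```

For example, with `x = [0, 3, 0, -2]` and any $4 \times 4$ matrix `R`, `inner(diag_matrix(x), R)` agrees with `dot(x, induced_vector_measurement(R))`.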

The purpose of this section is to show that the two problems (when concerned with non- 
adaptive, exact measurements) are essentially equivalent. That is, one can (efficiently) perform 
low-rank-recovery given any construction of a sparse-recovery set. 

To motivate the reduction from low-rank-recovery to sparse-recovery, we will show that our
above hitting set results already imply low-rank-recovery results, and that these hitting sets can be
seen as being constructed from a well-known sparse-recovery construction. We begin by recalling
the (standard) fact of Lemma 3.10 that any hitting set family yields a low-rank-recovery family, so in
particular our results do so. Combining the above with our constructions of hitting sets, we derive
the following corollary.

Corollary 7.3. The sets $\mathcal{B}_{2r,n,m}$, $\mathcal{D}_{2r,n,m}$, $\mathcal{B}'_{2r,n,m}$, and $\mathcal{D}'_{2r,n,m}$ (from Construction 5.5,
Construction 5.7 and Construction 5.9) are $r$-low-rank-recovery sets.

However, the above results are non-constructive. That is, they show that recovery is 
information-theoretically possible from this set of matrices, but do not give any insight how to 
perform this recovery efficiently. The purpose of this section is to show that we can strengthen 
Corollary 7.3 such that the recovery can be efficiently performed. 

To motivate our recovery algorithm, let us first discuss the $r$-low-rank-recovery set $\mathcal{D}_{2r,n,m}$. For
an $n \times m$ matrix $M$, consider the constraints that the system $\langle M, \mathcal{D}_{2r,n,m}\rangle = 0$ imposes on $M$.
By construction of $\mathcal{D}_{2r,n,m}$, we see that each $k$-diagonal of $M$ has $2r$ constraints imposed on it.
If we write the $k$-diagonal of $M$ as $x$, we can express the constraints on $x$ as $Ax = 0$, where $A$
is of size $2r \times |x|$, where $|x|$ denotes the size of the $k$-diagonal. Further, $A$ has the format (when
$2r \le k + 1 \le n$)

$$A = \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & g & g^2 & \cdots & g^{|x|-1} \\
1 & g^2 & g^4 & \cdots & g^{2(|x|-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & g^{2r-1} & g^{2(2r-1)} & \cdots & g^{(2r-1)(|x|-1)}
\end{pmatrix} \qquad (2)$$

which is important because of the following claim.

Lemma 7.4. Let $x$ be an $r$-sparse $\mathbb{F}$-vector. Let $g$ be of order $\ge |x|$ in some extension $\mathbb{K}$ of $\mathbb{F}$, and
let $A$ be a $2r \times |x|$ sized matrix of the form in Equation (2). Then $x$ is determined by $Ax$.

Proof. Suppose $x$ and $y$ are two $r$-sparse vectors such that $Ax = Ay$. By linearity we then have
that $A(x - y) = 0$, so that $A$ has a linear dependence on $\le 2r$ of the columns.

However, as the order of $g$ is $\ge |x|$, each $2r \times 2r$ minor of $A$ is a Vandermonde matrix on distinct
entries, and so is full-rank. In particular, any linear dependence on $\le 2r$ of the columns must be zero.
So $x - y = 0$, so $x = y$. Thus, $x$ is determined by $Ax$. □
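Lemma 7.4 can be checked by brute force on tiny instances. The sketch below works over a prime field $\mathbb{F}_p$ rather than an extension field (an assumption made to keep the arithmetic short), builds the matrix of Equation (2), and tests whether all $r$-sparse vectors have distinct images. The function names are illustrative.

```python
from itertools import combinations, product


def vandermonde(g, rows, cols, p):
    """The matrix of Equation (2): entry (i, j) is g^(i*j) mod p."""
    return [[pow(g, i * j, p) for j in range(cols)] for i in range(rows)]


def apply_matrix(A, x, p):
    """Compute A x over F_p as a tuple."""
    return tuple(sum(a * b for a, b in zip(row, x)) % p for row in A)


def sparse_vectors(length, r, p):
    """Yield each vector in F_p^length with at most r non-zero entries exactly once."""
    for size in range(r + 1):
        for support in combinations(range(length), size):
            for values in product(range(1, p), repeat=size):
                x = [0] * length
                for k, val in zip(support, values):
                    x[k] = val
                yield tuple(x)


def syndromes_are_distinct(g, length, r, p):
    """True iff x -> A x is injective on r-sparse vectors, with A of size 2r x length."""
    A = vandermonde(g, 2 * r, length, p)
    seen = set()
    for x in sparse_vectors(length, r, p):
        y = apply_matrix(A, x, p)
        if y in seen:
            return False
        seen.add(y)
    return True
```

With $p = 7$ and $g = 3$ (which has order 6) the map is injective on 2-sparse vectors of length 5, while $g = 6$ (order 2) already fails for 1-sparse vectors because two columns of $A$ coincide. This shows why the order condition on $g$ is needed.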

Note that the row-space of the above matrix is a Reed-Solomon code, and so the above lemma
shows the standard fact that the dual Reed-Solomon code has good distance. In particular, we can
do error correction for up to $r$ errors. This is exactly the question of $r$-sparse recovery (when we
are correcting errors from the codeword).

This lemma shows that at each $k$-diagonal, $\mathcal{D}_{2r,n,m}$ embeds an $r$-sparse-recovery set. Thus,
it seems plausible that a low-rank-recovery algorithm for $\mathcal{D}_{2r,n,m}$ might only use this fact in its
construction, and thus show low-rank-recovery can be done whenever each of the $k$-diagonals are
measured according to an $r$-sparse-recovery set. Indeed, this is what is shown by Theorem 7.19.

The reduction from low-rank-recovery to sparse-recovery is detailed in the following two sub- 
sections. The first subsection details a slightly stronger notion of sparse-recovery, which we call 
advice-sparse-recovery. This notion requires sparse-recovery when supplied with some advice on the 
support of the unknown vector. This is the correct notion of sparse-recovery when attempting to do 
low-rank-recovery, but the standard notion is sufficient with some loss in parameters. We describe a 
well-known algorithm, known as Prony's method, for efficiently performing the recovery illustrated 
in Lemma 7.4, and show that this method can be modified to also achieve advice-sparse-recovery. 

The second subsection gives the reduction from low-rank-recovery to sparse-recovery. Combin- 
ing this with our modifications to Prony's method, we conclude that the low-rank-recovery shown 
in Corollary 7.3 can also be performed efficiently. 

7.1 Prony's Method and Syndrome Decoding of Dual Reed-Solomon Codes 

In this section we detail an algorithm for efficiently performing the sparse-recovery demonstrated
in Lemma 7.4. While our discovery of the algorithm was independent of prior work, it was
originally detailed by Prony [dP95] in 1795 and is well-known in the signal-processing community
(see [PCM88] and references therein). It can also be seen as syndrome decoding of the dual to
the Reed-Solomon code. What we detail here is not exactly the original method, as we seek an
advice-sparse-recovery set, which is a slightly stronger condition which will be useful in our
low-rank-recovery algorithm. In coding theory terminology, we are seeking to syndrome decode the
dual Reed-Solomon code in the presence of erasures. We now define this stronger notion.

Definition 7.5. A set of vectors $\mathcal{V} \subseteq \mathbb{F}^n$ is an $s$-advice-sparse-recovery set if for every
$S \in \binom{[\![n]\!]}{\le 2s}$, and vector $x \in \mathbb{F}^n$ with $\le s - |S|/2$ non-zero entries outside of $S$, $x$ is uniquely determined
by $S$ and $y$, where $y \in \mathbb{F}^{\mathcal{V}}$ is defined by $y_v := \langle x, v\rangle$, for $v \in \mathcal{V}$.

An algorithm performs recovery from $\mathcal{V}$ if for each such $x$, it recovers that $x$ given $S$ and $y$.

Note that the vector $y$ can also be defined as $y = Vx$, where $V \in \mathbb{F}^{|\mathcal{V}| \times n}$ is the matrix whose
rows are those vectors in $\mathcal{V}$.

The motivation for this new definition is to capture situations where $x$ is known to have sparse
support overall, and further some of its support is already known and given by the set $S$. The
results below show that exploiting this knowledge allows $|\mathcal{V}|$ to be smaller. To see why this might
be intuitively plausible, one can count degrees of freedom. In an $s$-sparse vector $x$, there are
intuitively $2s$ degrees of freedom: it takes $s$ degrees to determine $\mathrm{Supp}(x)$, and it takes $s$ degrees
to determine $(x_i)_{i \in \mathrm{Supp}(x)}$.

In the above definition of an $s$-advice-sparse-recovery set, the unknown vector $x$ can have a
support of size $2s$ (when $|S| = 2s$). If one ignores the set $S$, there would be $4s$ degrees of freedom,
by the above argument, leading one to expect a lower bound of "$|\mathcal{V}| \ge 4s$". However, if one
exploits this knowledge, then there are only $s - |S|/2$ degrees of freedom to determine $\mathrm{Supp}(x)$,
and $|S| + (s - |S|/2)$ degrees of freedom to determine $(x_i)_{i \in \mathrm{Supp}(x)}$, which gives a total of $2s$ degrees
of freedom.
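The count can be written out explicitly; the numerical instance ($s = 3$, $|S| = 4$) is only an illustrative choice.

```latex
\[
  \underbrace{\Bigl(s - \tfrac{|S|}{2}\Bigr)}_{\text{locating } \mathrm{Supp}(x) \setminus S}
  \;+\;
  \underbrace{|S| + \Bigl(s - \tfrac{|S|}{2}\Bigr)}_{\text{values of } x \text{ on } \mathrm{Supp}(x) \cup S}
  \;=\; 2s,
  \qquad
  \text{e.g. } s = 3,\ |S| = 4:\quad 1 + (4 + 1) = 6 = 2s .
\]
```

In this instance $x$ may have up to $5$ non-zero entries, so ignoring the advice and treating $x$ as merely $5$-sparse would suggest $10$ degrees of freedom instead of $6$.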

Thus we see that using the information given in $S$ can reduce the degrees of freedom in $x$, and
below we match this intuition by recovering $x$ from $2s$ measurements. This intuition is the same
intuition in coding theory that an erasure is a "half error", but specialized to syndrome decoding.

In the next subsection, we will see that $r$-low-rank-recovery reduces to the problem of $r$-advice-sparse-recovery.
When $S = \emptyset$ then $r$-advice-sparse-recovery is exactly the notion of an $r$-sparse-recovery.
However, we will need $S$ to have size up to $2r$. Note that regardless of the size of $S$, $x$
will be $2r$-sparse. Thus the following lemma is immediate.

Lemma 7.6. Let V be a 2s-sparse-recovery set. Then V is also a s-advice-sparse-recovery set. 

To our knowledge, the existing work on Prony's method gives an algorithm for performing
sparse-recovery. However, in our reduction advice-sparse-recovery is more natural. The above lemma shows
that these notions are equivalent, up to a loss in parameters. However, to get better constructions we
detail how to modify Prony's method to achieve advice-sparse-recovery without a loss in parameters.

Algorithm 1 Prony's method with an advice set

 1: procedure PronysMethod($n, s, S, y, \{g_0, \dots, g_{n-1}\}$)
 2:     if $|S|$ odd then
 3:         Enlarge $S$ by 1 position
 4:     end if
 5:     Let $t := |S|/2$ and write $S = \{k_0, \dots, k_{|S|-1}\}$
 6:     Construct $A \in \mathbb{F}^{(s+t) \times (s+t+1)}$, with $A_{i,j} := g_{k_i}^{j}$ if $i < |S|$, and $A_{i,j} := y_{i+j-|S|}$ else
 7:     Convert $A$ to row-reduced echelon form
 8:     Let $r \in [s+t]$ be the largest number so the $r \times r$ leading principal minor of $A$ is full rank
 9:     Let $c \in \mathbb{F}^{r+1}$ be a non-zero vector in the nullspace of the leading $r \times (r+1)$ minor of $A$
10:     Define $p(x) := \sum_{i=0}^{r} c_i x^i$
11:     $T = \{k \mid p(g_k) = 0\}$    ▷ $T$ will be $\mathrm{Supp}(x) \cup S$
12:     $D \in \mathbb{F}^{2s \times T}$, $D_{i,k} := g_k^{i}$, for $k \in T$
13:     Solve $Dz = y$ for $z$ (using Gaussian Elimination)
14:     Define $x \in \mathbb{F}^n$, as $x_k := z_k$ if $k \in T$, and $x_k := 0$ else
15:     return $x$
16: end procedure

Theorem 7.7. Let $\mathbb{F}$ be a field, and let $g_0, \dots, g_{n-1} \in \mathbb{F}$ be distinct. Let $v_i \in \mathbb{F}^n$ be the vector
with entries $(v_i)_j := g_j^i$. Then the set $\mathcal{V} = \{v_i\}_{i=0}^{2s-1}$ is an $s$-advice-sparse-recovery set. Further,
PronysMethod($n, s, S, Vx, \{g_0, \dots, g_{n-1}\}$) (Algorithm 1) recovers $x$ in $O(s^3 + sn)$ operations
(where operations over $\mathbb{F}$ are counted at unit cost), where $V \in \mathbb{F}^{2s \times n}$ is the matrix with the vectors
in $\mathcal{V}$ as its rows.

In particular, if $g \in \mathbb{F}$ has order at least $n$, we can take $g_j = g^j$.

Proof. As above, define $V \in \mathbb{F}^{2s \times n}$ to be the matrix whose rows are those vectors $v_i$. That
is, $V_{i,j} = g_j^i$. As the $g_j$ are distinct, it follows that every $2s \times 2s$ minor of $V$ is an invertible
Vandermonde matrix. It follows that each subset of $\le 2s$ columns of $V$ is linearly independent.

Define $\mathbf{g}_j \in \mathbb{F}^{2s}$ by $(\mathbf{g}_j)_i = g_j^i$. It follows that the $\mathbf{g}_j$ are the columns of $V$. For a vector $a \in \mathbb{F}^m$,
define $a^{[k,\ell]} \in \mathbb{F}^{\ell-k+1}$ to be the vector with entries $a_k, \dots, a_\ell$.

$\mathcal{V}$ is an $s$-advice-sparse-recovery set: Consider a set $S \in \binom{[\![n]\!]}{\le 2s}$ and vectors $x, w \in \mathbb{F}^n$ where each
has at most $s - |S|/2$ non-zero entries outside of $S$. Suppose that $Vx = Vw$. By linearity, this
yields the vector $x - w$ such that $V(x - w) = 0$ and $x - w$ has at most $2(s - |S|/2)$ non-zero entries
outside of $S$. In total, $x - w$ has at most $|S| + 2(s - |S|/2) = 2s$ non-zero entries. However, as
mentioned above, each subset of $\le 2s$ columns of $V$ is linearly independent. As $0 = V(x - w)$ is
a linear combination of $\le 2s$ columns of $V$, it follows that $x - w = 0$. Thus, any such $x$ is uniquely
determined by $S$ and $Vx$.

Algorithm 1 performs recovery: Consider a set $S \in \binom{[\![n]\!]}{\le 2s}$, with $S = \{k_0, \dots, k_{|S|-1}\}$. For any
vector $x$ the condition that $|\mathrm{Supp}(x) \setminus S| \le s - |S|/2$ implies that $|\mathrm{Supp}(x) \setminus S| \le s - \lceil |S|/2 \rceil$ by
integrality. It follows that we may assume the set $S$ has even size, as we can always enlarge it by
one position without changing the above constraints on the support of $x$. (If $S = [\![n]\!]$ prior to this
enlargement, we simulate $n + 1$ long vectors.) Now define $t$ so $|S| = 2t$.

Consider a vector $x \in \mathbb{F}^n$ with $\nu \le s - |S|/2 = s - t$ non-zero entries outside of $S$. By
construction of $y$ (recall $y = Vx$),

$$y = \sum_{k \in S} x_k \mathbf{g}_k + \sum_{k \in \mathrm{Supp}(x) \setminus S} x_k \mathbf{g}_k \qquad (3)$$

The aim of this analysis will be to show that we can determine $\mathrm{Supp}(x)$ and then leverage this to
solve the above equation for $x$.

We now establish some theory to analyze the algorithm. The above equation can be refined to
see that

$$y^{[i,j]} = \sum_{k \in S} x_k \mathbf{g}_k^{[i,j]} + \sum_{k \in \mathrm{Supp}(x) \setminus S} x_k \mathbf{g}_k^{[i,j]}
= \sum_{k \in S} x_k g_k^{i}\, \mathbf{g}_k^{[0,j-i]} + \sum_{k \in \mathrm{Supp}(x) \setminus S} x_k g_k^{i}\, \mathbf{g}_k^{[0,j-i]} \qquad (4)$$

We note here that the rows of $A$ involving $y$ can be written as $y^{[0,s+t]}, \dots, y^{[s-t-1,2s-1]}$. As $y$ has
$2s$ entries, each of these vectors is well-defined, and each entry in $y$ is used in $A$.
We now establish some claims about $A$ using that $\nu = |\mathrm{Supp}(x) \setminus S|$.

Claim 7.8. The $(|S| + \nu + 1) \times (|S| + \nu + 1)$ leading principal minor of $A$ is singular.

Proof. Denote this leading minor by $M$. The rows of $M$ are of the form $\mathbf{g}_{k_j}^{[0,|S|+\nu]}$ for $j < |S|$,
and $y^{[\ell,|S|+\nu+\ell]}$ for $0 \le \ell \le \nu$. Trivially, for each $j < |S|$, $\mathbf{g}_{k_j}^{[0,|S|+\nu]} \in \mathrm{Span}\{\mathbf{g}_k^{[0,|S|+\nu]}\}_{k \in \mathrm{Supp}(x) \cup S}$.
Further, Equation 4 shows that $y^{[\ell,|S|+\nu+\ell]} \in \mathrm{Span}\{\mathbf{g}_k^{[0,|S|+\nu]}\}_{k \in \mathrm{Supp}(x) \cup S}$. Thus, the $|S| + \nu + 1$ rows
of $M$ each lie in a $\le (|S| + \nu)$-dimensional subspace, implying that $M$ is singular. □

Claim 7.9. The $(|S| + \nu) \times (|S| + \nu)$ leading principal minor of $A$ is invertible.

Proof. Denote this leading minor by $M$. We will show that $M = BC$, for $B, C \in \mathbb{F}^{(|S|+\nu) \times (|S|+\nu)}$
both invertible, which implies the claim.

Let the rows of $C$ be the vectors $\mathbf{g}_k^{[0,|S|+\nu-1]}$, for each $k \in \mathrm{Supp}(x) \cup S$. We will index the rows
by the $g_k$, and assume that the first $|S|$ such $g_k$ are those with $k \in S$. This is a Vandermonde
matrix, and as such is invertible.

Let $B$ be defined by

$$B_{i,k} := \begin{cases} 1 & \text{if } i = k < |S| \\ 0 & \text{if } i \ne k,\ i < |S| \\ x_k g_k^{\,i-|S|} & \text{else} \end{cases}$$

It follows from Equation 4 that $M = BC$. Note that $B$ has the form

$$B = \begin{pmatrix} I_{|S|} & 0 \\ E X_1 & F X_2 \end{pmatrix}$$

where $X_1 \in \mathbb{F}^{|S| \times |S|}$ is the diagonal matrix with diagonal entries $x_k$, for $k \in S$ (ordered to match
$C$), $X_2 \in \mathbb{F}^{\nu \times \nu}$ is the diagonal matrix with diagonal entries $x_k$, for $k \in \mathrm{Supp}(x) \setminus S$ (ordered to
match $C$), $E \in \mathbb{F}^{\nu \times |S|}$ is the Vandermonde matrix with entries $E_{i,k} := g_k^i$, for $k \in S$, and $F \in \mathbb{F}^{\nu \times \nu}$
is the (invertible) Vandermonde matrix with entries $F_{i,k} := g_k^i$ for $k \in \mathrm{Supp}(x) \setminus S$.

Note that $X_1$ might entirely be zero, but $X_2$ must be invertible by the assumption that $x$ has
exactly $\nu$ non-zero entries outside of $S$. As $F$ is invertible, it follows that $F X_2$ is invertible, and
thus $B$ is also invertible.

Thus, $M = BC$ with $B$ and $C$ both invertible matrices. The claim follows. □

As the first $|S|$ rows of $A$ are rows of a Vandermonde matrix, it follows that the first $|S|$ leading
principal minors are all invertible. This, along with the above two claims, thus shows that $|S| + \nu$
is the minimum $r$ such that the $(r + 1) \times (r + 1)$ leading principal minor of $A$ is singular. It follows that
in Algorithm 1 the $r$ value chosen in Step 8 is in fact $|S| + \nu$.

We now show that the c chosen by the algorithm also has significance. 

Claim 7.10. Let $p(x) := \prod_{k \in \mathrm{Supp}(x) \cup S}(x - g_k) = \sum_{i=0}^{|S|+\nu} c_i x^i$. Then the vector $c \in \mathbb{F}^{|S|+\nu+1}$ defined
by those coefficients $c_i$ is in the nullspace of the $(|S| + \nu) \times (|S| + \nu + 1)$ leading minor of $A$.

Proof. Denote this leading minor by $M$.

Note that any $g_k$ with $k \in \mathrm{Supp}(x) \cup S$ has $\langle \mathbf{g}_k^{[0,|S|+\nu]}, c\rangle = 0$, as this simply says that
$p(g_k) = 0$. Thus, we see that $c$ is orthogonal to the first $|S|$ rows of $M$.

Now observe that Equation 4 shows that the last $\nu$ rows of $M$ are all in the span of the vectors
$\mathbf{g}_k^{[0,|S|+\nu]}$ for $k \in \mathrm{Supp}(x) \cup S$. As $c$ is orthogonal to each of these vectors by construction, we see
that it must also be orthogonal to the last $\nu$ rows of $M$.

Thus, $c$ is orthogonal to each row of $M$, and thus is in its nullspace. □

The algorithm chooses some $c$ that is in the nullspace of the $(|S| + \nu) \times (|S| + \nu + 1)$ leading minor
of $A$. However, as the $(|S| + \nu) \times (|S| + \nu)$ leading principal minor of $A$ is invertible, it follows that
the $(|S| + \nu) \times (|S| + \nu + 1)$ leading minor of $A$ has a nullspace of dimension 1. Thus, the $c$ chosen
by the algorithm must be a (non-zero) multiple of the coefficient vector of $\prod_{k \in \mathrm{Supp}(x) \cup S}(x - g_k)$.
It follows that the set $T$ is equal to $\mathrm{Supp}(x) \cup S$.

Thus, Equation 3 gives a linear system with right-hand side $y$, with $\le 2s$ variables and $2s$ equations, where $x$
(restricted to $\mathrm{Supp}(x) \cup S$) is a solution. The system is full-rank, so $x$ is the only solution. Further,
$x$ can be recovered via Gaussian Elimination, and this is exactly what Algorithm 1 does. Thus,
correctness is also established in this case.

Algorithm 1 runs in $O(s^3 + sn)$ operations: Constructing the matrix $A$ takes $O(s^2)$ operations,
as that is the size of the matrix and each entry can be computed in $O(1)$ operations (the $g_k^i$ are
computed with $i$ increasing). Converting $A$ to reduced-row echelon form takes $O(s^3)$ operations.
Determining the number $r$ in Step 8 also takes $O(s)$ operations, as $r = \max\{i \mid A_{i,i} \ne 0\}$.
Determining the vector $c$ takes $O(s)$ because the $r \times (r + 1)$ minor is in row-reduced echelon form. That is,
for $1 \le i \le r$, $c_i = -A_{i,r+1}$ and $c_{r+1} = 1$. Constructing $p$ and $T$ takes $O(sn)$ time, as we just test
if $p(g_k) = 0$ for each $k$, and $p$ is of degree $O(s)$. $D$ is a Vandermonde matrix with at most $O(s^2)$
entries, and so constructing $D$ takes $O(s^2)$ steps. Solving for $z$ takes $O(s^3)$ steps, and determining
the final $x$ takes $O(n)$ steps. □
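To make Algorithm 1 concrete, here is a minimal sketch over a prime field $\mathbb{F}_p$. It follows the same steps (build $A$, find the largest invertible leading block, read off the locator polynomial, find its roots among the $g_k$, then solve the Vandermonde system), but it is not tuned to the $O(s^3 + sn)$ bound: it re-runs Gaussian elimination for each candidate $r$. The function names, the restriction to prime fields, and the assumption that an odd-sized $S$ is not already all of $[\![n]\!]$ are choices made for this illustration.

```python
def rref(M, p):
    """Reduced row echelon form of M over F_p; returns (matrix, pivot_columns)."""
    M = [[v % p for v in row] for row in M]
    pivots, row = [], 0
    for col in range(len(M[0]) if M else 0):
        if row == len(M):
            break
        piv = next((i for i in range(row, len(M)) if M[i][col]), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        inv = pow(M[row][col], -1, p)  # modular inverse (Python 3.8+)
        M[row] = [v * inv % p for v in M[row]]
        for i in range(len(M)):
            if i != row and M[i][col]:
                f = M[i][col]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[row])]
        pivots.append(col)
        row += 1
    return M, pivots


def measure(x, g, s, p):
    """The 2s measurements y_i = sum_k x_k * g_k^i, for i = 0, ..., 2s-1."""
    return [sum(xk * pow(gk, i, p) for xk, gk in zip(x, g)) % p for i in range(2 * s)]


def prony_with_advice(n, s, S, y, g, p):
    """Recover x from y = measure(x, g, s, p), given the advice set S."""
    S = sorted(set(S))
    if len(S) % 2 == 1:
        # Enlarge S by one position (assumes S is not already all of range(n)).
        S.append(next(k for k in range(n) if k not in S))
    t = len(S) // 2
    width = s + t + 1
    # Rows from the advice set, followed by shifted windows of the measurements.
    A = [[pow(g[k], j, p) for j in range(width)] for k in S]
    A += [[y[l + j] for j in range(width)] for l in range(s - t)]
    # Largest r such that the leading r x r block of A is invertible.
    r = 0
    for cand in range(s + t, 0, -1):
        if len(rref([row[:cand] for row in A[:cand]], p)[1]) == cand:
            r = cand
            break
    # Null vector of the leading r x (r+1) block: locator polynomial coefficients.
    if r == 0:
        c = [1]
    else:
        R, _ = rref([row[:r + 1] for row in A[:r]], p)
        c = [(-R[i][r]) % p for i in range(r)] + [1]
    # T collects the indices k with p(g_k) = 0.
    T = [k for k in range(n)
         if sum(ci * pow(g[k], i, p) for i, ci in enumerate(c)) % p == 0]
    x = [0] * n
    if T:
        # Solve the 2s x |T| Vandermonde system D z = y by elimination.
        aug = [[pow(g[k], i, p) for k in T] + [y[i]] for i in range(2 * s)]
        R, piv = rref(aug, p)
        for row_idx, col in enumerate(piv):
            if col < len(T):
                x[T[col]] = R[row_idx][len(T)]
    return x
```

For instance, with $p = 101$, $g_j = 2^j$, $n = 10$ and $s = 3$, the advice set $S = \{1, 4\}$ allows two further unknown support positions, so a vector supported on $\{1, 3, 8\}$ is recovered from $2s = 6$ measurements.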

This theorem provides us with an $s$-advice-sparse-recovery set, using $2s$ measurements. We will
now leverage this in the next subsection to get a full algorithm for low-rank-recovery.

7.2 Low Rank Recovery

In this subsection we describe how the problem of (exact, non-adaptive) $r$-low-rank-recovery
deterministically reduces to the problem of (exact, non-adaptive) $r$-advice-sparse-recovery. We will
first define a normal form for a matrix which we call $(<k)$-upper-echelon form, which (recalling
the notation of Section 2) is roughly defined as saying that a matrix $M$ has $M^{(<k)}$ in reduced
row-echelon form. We then show that for any matrix $M$ in this form, the diagonal $M^{(k)}$ is sparse.
Thus, using sparse-recovery we can then recover this diagonal. This process is then continued by
using row-reduction to put $M$ in $(\le k)$-upper-echelon form, and then recovering $M^{(k+1)}$ and so on.

The above process uses the sparse-recovery oracle in an adaptive way. The algorithm we detail
below will actually use the sparse-recovery oracle non-adaptively. The measurements made to the
matrix $M$ will be the sparse-recovery oracle applied to each $k$-diagonal. While these diagonals
are not themselves sparse, we show that the row-reduction of $M$ (that makes $M$ into upper-echelon
form) acts such that we can simulate the adaptive measurements from the non-adaptive
measurements by computing the suitable corrections.

We now begin by describing some structural properties of matrices, which we will apply to 
understand upper-echelon form. 

Definition 7.11. Let $M$ be an $n \times m$ matrix. The entry $(i, j)$ is a leading non-zero entry, if
$M_{i,j} \ne 0$ and $M_{i,j'} = 0$ for $j' < j$.

Denote $\mathrm{LNE}(M)$ to be the set of all such leading non-zero entries. If $S$ is a subset of entries in
$M$, denote $\mathrm{LNE}(S) := \mathrm{LNE}(M) \cap S$.

Denote $\mathrm{LNE}_R(S)$ to be the set containing the rows of the coordinates in $\mathrm{LNE}(S)$, and denote
$\mathrm{LNE}_C(S)$ to be the multi-set containing the columns of the coordinates in $\mathrm{LNE}(S)$.

It is clear that each row can have at most one leading non-zero entry, and possibly none. A 
column could be associated with several leading non-zero entries. 
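Definition 7.11 translates directly into code. The sketch below uses 0-indexed rows and columns and represents the multi-set $\mathrm{LNE}_C$ as a sorted list; the function names are hypothetical.

```python
def leading_nonzero_entries(M, entries=None):
    """Return LNE(M), or LNE(M) intersected with `entries`, as a list of (row, col)."""
    result = []
    for i, row in enumerate(M):
        for j, value in enumerate(row):
            if value != 0:
                # (i, j) is the leading non-zero entry of row i.
                if entries is None or (i, j) in entries:
                    result.append((i, j))
                break
    return result


def lne_rows(lne):
    """LNE_R: the set of rows of the given leading non-zero entries."""
    return {i for i, _ in lne}


def lne_cols(lne):
    """LNE_C: the multi-set of columns, represented as a sorted list."""
    return sorted(j for _, j in lne)
```

For the matrix with rows `[0, 2, 1]`, `[0, 0, 0]`, `[3, 0, 0]`, `[0, 5, 0]`, the zero row contributes no entry and column 1 appears twice in the multi-set of columns.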

Definition 7.12. An $n \times m$ matrix $M$ is in $(<k)$-upper-echelon form if, for each $(i, j) \in
\mathrm{LNE}(M^{(<k)})$, $M_{i',j} = 0$ for all $i < i' < k - j$.

Note that a matrix is $(<k)$-upper-echelon if it is $(<k')$-upper-echelon and $k' \ge k$, and that
every matrix is vacuously in $(<0)$-upper-echelon form.
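The definition can be checked mechanically. The sketch below assumes 0-indexed entries and that $M^{(<k)}$ consists of the entries $(i, j)$ with $i + j < k$ (the $k$-diagonal being the entries with $i + j = k$, as used in the proofs below); the function name is hypothetical.

```python
def is_upper_echelon(M, k):
    """Check whether M is in (<k)-upper-echelon form.

    For every leading non-zero entry (i, j) with i + j < k, all entries
    M[i2][j] with i < i2 < k - j must be zero.
    """
    for i, row in enumerate(M):
        j = next((col for col, value in enumerate(row) if value != 0), None)
        if j is None or i + j >= k:
            continue  # zero row, or leading entry outside M^(<k)
        for i2 in range(i + 1, min(k - j, len(M))):
            if M[i2][j] != 0:
                return False
    return True
```

For example, the matrix with rows `[1, 2, 3]`, `[0, 5, 6]`, `[7, 8, 9]` is in $(<2)$-upper-echelon form but not in $(<3)$-upper-echelon form, since the entry $7$ lies below the leading entry of the first row on the $2$-diagonal.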

We now recall the following standard linear-algebraic fact about triangular systems, phrased in 
the language of leading non-zero entries. 

Lemma 7.13. Let $M$ be an $n \times m$ matrix with all non-zero rows, such that $\mathrm{LNE}_C(M)$ has no
repetitions. Then the rows of $M$ are linearly independent.

Proof. Denote the column of the leading non-zero entry of row $i$ by $j_i$. Each row must have such a
value as each row is non-zero. As linear independence is invariant under permutation, we assume
without loss of generality that the rows are ordered such that the $j_i$ are strictly increasing with $i$.
This is possible as the $j_i$ are assumed to be distinct. Write these rows as vectors $v^{(i)}$. Now consider
any non-trivial linear combination $\sum_i c_i v^{(i)}$. Pick $i_0$ to be the least number such that $c_{i_0} \ne 0$. As
the $j_i$ are strictly increasing, it follows that the $j_{i_0}$-th entry of $v^{(i)}$ is zero for $i > i_0$. Thus, we now
expand out the $j_{i_0}$-th index of the above summation

$$\Big(\sum_i c_i v^{(i)}\Big)_{j_{i_0}} = \sum_{i < i_0} c_i v^{(i)}_{j_{i_0}} + c_{i_0} v^{(i_0)}_{j_{i_0}} + \sum_{i_0 < i} c_i v^{(i)}_{j_{i_0}} = \sum_{i < i_0} 0 \cdot v^{(i)}_{j_{i_0}} + c_{i_0} v^{(i_0)}_{j_{i_0}} + \sum_{i_0 < i} c_i \cdot 0 = c_{i_0} v^{(i_0)}_{j_{i_0}} \ne 0.$$

Thus we see that this linear combination is non-zero, and as this was any non-trivial linear
combination it follows that these rows are linearly independent. □

We now show that matrices in upper-echelon form cannot have many leading non-zero entries.

Lemma 7.14. Let $M$ be an $n \times m$ matrix of rank $\le r$. If $M$ is $(<k)$-upper-echelon, then
$|\mathrm{LNE}(M^{(<k)})| \le r$. Further, $\mathrm{LNE}_C(M^{(<k)})$ has no repetitions.

Proof. Given $(i,j) \in \mathrm{LNE}(M^{(<k)})$, $(<k)$-upper-echelon form implies that $M_{i',j} = 0$ for any $i'$ with
$i < i' < k - j$. It follows that given two distinct entries $(i,j), (i',j) \in M^{(<k)}$ at most one can be a
leading non-zero entry. Thus we see that $\mathrm{LNE}_C(M^{(<k)})$ has no repetitions.

Lemma 7.13 then implies that the rows in $\mathrm{LNE}_R(M^{(<k)})$ are linearly independent. Thus,
$|\mathrm{LNE}(M^{(<k)})| \le \mathrm{rank}(M) \le r$. □

The next lemma is the key insight of the algorithm. It shows that, for any matrix in $(<k)$-upper-echelon
form, the $k$-diagonal must be sparse. Further, the sparseness is bounded by twice
the rank of the matrix (the lemma presents a more refined statement).

Lemma 7.15. Let $M$ be an $n \times m$ matrix with rank $\le r$, such that $M$ is in $(<k)$-upper-echelon
form with $0 \le k \le n+m-2$. Let $s = |\mathrm{LNE}(M^{(<k)})|$, $I = \mathrm{LNE}_R(M^{(<k)})$, $J = \mathrm{LNE}_C(M^{(<k)})$.
Then $M^{(k)}$ has $\le r - s$ non-zero entries with columns outside $S = (k - I) \cup J$ (where
$k - I = \{k - i : i \in I\}$), and thus $M^{(k)}$ is $(r+s)$-sparse.

Proof. Note that by Lemma 7.14 we have that $s \le r$, so that $r - s \ge 0$ and $r + s \le 2r$.

Let $I'$ be the rows that contain non-zero entries in $M^{(k)}$ whose columns lie outside $S$. We
will show that the rows in $I \cup I'$ are linearly independent. This will complete the claim as $|I'| \le
\mathrm{rank}(M) - |I| \le r - s$, and observing that $|S| \le 2s$.

Now consider the columns of the leading non-zero entries of the rows in $I'$. Any row $i \in I$
intersects $M^{(k)}$ at column $k - i \in S$. This means that row $i$ cannot contain a non-zero entry in
$M^{(k)}$ with column outside of $S$, so $I$ and $I'$ are disjoint.

Any row $i$ with a non-zero entry in $M^{(<k)}$ must have a leading non-zero entry in $M^{(<k)}$, and
thus any such $i$ is contained in $I$. Thus, as $I$ and $I'$ are disjoint, it follows that any row $i' \in I'$ only
has zero entries within $M^{(<k)}$. As such a row $i'$ has a non-zero entry on $M^{(k)}$, it follows that the
leading non-zero entry of a row $i' \in I'$ is $(i', k - i')$. This implies that the columns of the leading
non-zero entries of the rows in $I'$ are distinct (and outside of $S$ by construction).

The rows in $I$ have leading non-zero entries in $J \subseteq S$ and by Lemma 7.14, $J$ has no repetitions.
Thus, it follows that the rows $I \cup I'$ all have distinct columns for their leading non-zero entries,
which, by Lemma 7.13, implies that these rows are linearly independent. Invoking the rank bound,
as mentioned above, completes the proof. □

This lemma motivates the following idea for low-rank reconstruction. Iteratively, convert (using
row-reduction) the matrix into $(<k)$-upper-echelon form and then reconstruct, using any
sparse-recovery method, the $k$-th diagonal. This is exactly the algorithm we will present. However, to
establish correctness, we need to first understand how to convert a matrix into $(<k)$-upper-echelon
form, even in situations when $M^{(>k)}$ is unknown.

To do this, we will use row-reduction, as implemented by left-multiplication by lower-triangular
matrices. The following lemma shows that such multiplication can be computed on the partial
matrices $M^{(\le k)}$.

Lemma 7.16. Let $M$ be an $n \times m$ matrix, and $L$ be an $n \times n$ lower-triangular matrix. Then
$(LM)^{(\le k)}$ is computable in $O(\min(n,k) \min(m,k)\, k)$ arithmetic operations from $L$ and $M^{(\le k)}$.

Proof. An entry $(LM)_{i,j}$, for $i + j \le k$, is equal to $\sum_{\ell=0}^{n-1} L_{i,\ell} M_{\ell,j}$, which equals
$\sum_{\ell \le i} L_{i,\ell} M_{\ell,j}$ as $L$ is lower-triangular. Further, this sum is computable from $L$ and the
$(\le k)$-diagonals of $M$ as $\ell + j \le i + j \le k$. The time bound is the obvious bound on computing each of
$O(\min(n,k) \min(m,k))$ sums of $\le k$ terms. □
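
The computation in Lemma 7.16 can be sketched in Python as follows. This is a minimal illustration assuming 0-indexed entries and the convention that $M^{(\le k)}$ consists of the entries with $i + j \le k$. Unknown entries of $M$ are stored as `None` so that any accidental access outside the known diagonals would raise an error. The name `partial_product` is hypothetical.

```python
def partial_product(L, M_partial, n, m, k):
    """Compute the entries (i, j) of L*M with i + j <= k.

    L is an n x n lower-triangular matrix. M_partial is n x m and only its
    entries with i + j <= k are assumed known (others may be None).
    Entries of the result with i + j > k are left as None.
    """
    out = [[None] * m for _ in range(n)]
    for i in range(min(n, k + 1)):
        for j in range(min(m, k - i + 1)):
            # L is lower-triangular, so only l <= i contributes,
            # and then l + j <= i + j <= k, so M_partial[l][j] is known.
            out[i][j] = sum(L[i][l] * M_partial[l][j] for l in range(i + 1))
    return out

L = [[1, 0, 0],
     [2, 1, 0],
     [3, 4, 1]]
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
k = 2
# Hide every entry of M beyond the k-diagonal.
M_known = [[M[i][j] if i + j <= k else None for j in range(3)] for i in range(3)]
LM_partial = partial_product(L, M_known, 3, 3, k)
```

The result agrees with the full product $LM$ on every entry with $i + j \le k$, even though the later diagonals of $M$ were never read.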

We now establish a useful property on composing left-multiplication of special types of lower- 
triangular matrices. 

Lemma 7.17. Let $L, L'$ be $n \times n$ invertible, lower-triangular matrices, with all 1's along the main
diagonal. Then $LL'$ is an invertible, lower-triangular matrix, with all 1's along the main diagonal.
Further, if both $L - I_n$ and $L' - I_n$ only have non-zero entries in a subset $J$ of the columns,
then $LL' - I_n$ also has this property.

Proof. The facts that $LL'$ is an invertible, lower-triangular matrix and has all 1's along the main
diagonal are each straightforward.

We now prove the desired property of $LL' - I_n$. Consider some entry $(i,j)$ in $LL'$, with $j \notin J$
and $i > j$. It is then that

$$(LL')_{i,j} = \sum_{k \in [n]} L_{i,k} L'_{k,j} = \sum_{i \ge k \ge j} L_{i,k} L'_{k,j} = L_{i,i} L'_{i,j} + \sum_{i > k > j} L_{i,k} L'_{k,j} + L_{i,j} L'_{j,j}$$

$$= 1 \cdot L'_{i,j} + \sum_{i > k > j} L_{i,k} L'_{k,j} + L_{i,j} \cdot 1.$$

Observe that as $i > j$ and $j \notin J$, $L'_{i,j} = L'_{k,j} = L_{i,j} = 0$ (for any $k > j$). Thus, the above sum is
zero. Hence, the desired entries $(i,j)$ with $i > j$ and $j \notin J$ are zero, proving the claim. □
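
The closure property of Lemma 7.17 can be checked on a small instance. The Python sketch below is only an illustration on one hand-picked pair of matrices, assuming 0-indexed columns. The helper names are hypothetical.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def unit_lower(n, entries):
    """Identity matrix plus the given strictly-lower-triangular entries {(i, j): value}."""
    L = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for (i, j), value in entries.items():
        assert i > j, "only entries below the main diagonal are allowed"
        L[i][j] = value
    return L

def offdiagonal_columns(L):
    """Columns in which L - I has a non-zero entry."""
    return {j for i, row in enumerate(L) for j, v in enumerate(row)
            if i != j and v != 0}

J = {0, 1}
L1 = unit_lower(4, {(1, 0): 2, (3, 1): 4})   # L1 - I is supported on columns in J
L2 = unit_lower(4, {(2, 1): 3, (3, 0): 5})   # L2 - I is supported on columns in J
product = matmul(L1, L2)
```

Here `product` is again unit lower-triangular, and its off-diagonal non-zero entries stay within the columns of `J`.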

We now use these lemmas to analyze Algorithm 2, which gives a way to transform a matrix in
$(<k)$-upper-echelon form into one which is $(\le k)$-upper-echelon, and does so efficiently.

Algorithm 2 Transform a $(<k)$-upper-echelon matrix into $(\le k)$-upper-echelon form

1: procedure MakeUpperEchelon($M$, $n$, $m$, $k$)
2:     $L \leftarrow I_n$
3:     for all $(i,j) \in \mathrm{LNE}(M^{(<k)})$ do
4:         $L \leftarrow \big(I_n - \frac{M_{k-j,j}}{M_{i,j}} E_{k-j,i}\big) \cdot L$    ▷ $M_{i,j} \ne 0$ as $(i,j)$ is the leading non-zero entry in row $i$
5:     end for
6:     return $L$
7: end procedure



Claim 7.18. Let $M$ be an $n \times m$ matrix of rank $\le r$, such that $M$ is in $(<k)$-upper-echelon form,
for $0 \le k \le n+m-2$. Then the procedure MakeUpperEchelon($M$, $n$, $m$, $k$) (Algorithm 2) runs
in $O(rn)$ time and returns an invertible $n \times n$ lower-triangular matrix $L$ computed only from $M^{(\le k)}$,
such that $LM$ is $(\le k)$-upper-echelon and $(LM)^{(<k)} = M^{(<k)}$.

Also, $L$ is the product of $\le r$ elementary matrices and each main diagonal entry is equal to 1.

Further, $L - I_n$ only has non-zero entries with columns in $\mathrm{LNE}_R(M^{(<k)})$.

Proof. $(LM)^{(<k)} = M^{(<k)}$: We argue that the identity $(LM)^{(<k)} = M^{(<k)}$ is invariant. As $L = I_n$
initially, the identity holds at the beginning of the algorithm. We now proceed by induction.

In each run of Line 4, we add a multiple of row $i$ to row $k - j$ in $LM$, where $(i,j) \in \mathrm{LNE}(M^{(<k)})$
and thus $i + j < k$. Thus, row $i$ in $M$ has all entries in columns before $j$ equal to zero. By induction on the
identity $(LM)^{(<k)} = M^{(<k)}$, the entries in columns before $j$ in row $i$ of $LM$ are also zero when Line 4 is run.
It follows that the only action of this update to $(LM)^{(\le k)}$ is to set $(LM)_{k-j,j} = 0$. Thus, $(LM)^{(<k)}$
is unchanged, so $(LM)^{(<k)} = M^{(<k)}$ still holds.

$LM$ has $(\le k)$-upper-echelon form: As $(LM)^{(<k)} = M^{(<k)}$ throughout the algorithm, and $M$ is
in $(<k)$-upper-echelon form, it follows that $LM$ is in $(<k)$-upper-echelon form at termination. To
show $LM$ is in $(\le k)$-upper-echelon form upon termination, it suffices to show that $(LM)_{k-j,j} = 0$
for all $j \in \mathrm{LNE}_C(M^{(<k)})$. As running Line 4 has exactly this effect (and these updates are disjoint
and idempotent, thus do not conflict), and this line is run for all $(i,j) \in \mathrm{LNE}(M^{(<k)})$, it follows
that $LM$ is in $(\le k)$-upper-echelon form on termination.

$L$ computable from the $(\le k)$-diagonals of $M$: This is straightforward, as each query to $M$ is
within the $(\le k)$-diagonals.

$L$ is the product of $\le r$ elementary matrices: Each update to $L$ by Line 4 left-multiplies $L$ by
an elementary matrix. By Lemma 7.14, $|\mathrm{LNE}(M^{(<k)})| \le r$, so the loop of the algorithm is run at
most $r$ times.

Structure of $L$: By construction, $L$ is the product of matrices of the form $I_n + cE_{k-j,i}$, where
$i + j < k$ and $(i,j) \in \mathrm{LNE}(M^{(<k)})$. Regardless of the value of $c$, such a matrix is invertible,
lower-triangular, with main diagonal entries all 1, and all non-zero entries of $(I_n + cE_{k-j,i}) - I_n$
have columns in $\mathrm{LNE}_R(M^{(<k)})$. By Lemma 7.17 it follows that $L$ also has these properties.

Complexity: Left-multiplication by an elementary matrix can be done in $O(n)$ steps, and by the
above analysis, there are $\le r$ such multiplications. Further, by storing the leading non-zero entries
in each row, the pairs $(i,j)$ can be determined in $O(n)$ time. Thus the time is $O(rn)$ overall. □
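
Algorithm 2 and the properties listed in Claim 7.18 can be sketched in Python as below. This is an illustrative sketch, not the authors' implementation: it assumes 0-indexed entries, exact rational arithmetic via `fractions.Fraction`, and (as an added assumption) it skips a leading entry $(i,j)$ when row $k - j$ falls outside the matrix, a boundary case the pseudocode leaves implicit.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def make_upper_echelon(M, n, m, k):
    """Return a unit lower-triangular L such that L*M has zeros at (k - j, j)
    for every leading non-zero entry (i, j) of M with i + j < k."""
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n):
        # Column of the leading non-zero entry of row i, if any.
        j = next((c for c in range(m) if M[i][c] != 0), None)
        if j is None or i + j >= k:
            continue  # row i has no leading non-zero entry inside M^{(<k)}
        target = k - j  # row index of the k-diagonal entry in column j
        if target >= n:
            continue  # assumption: no such entry exists, nothing to eliminate
        c = -Fraction(M[target][j]) / Fraction(M[i][j])
        # Left-multiplying L by (I + c * E_{target, i}) adds c * (row i) to row target.
        for col in range(n):
            L[target][col] += c * L[i][col]
    return L

# M is in (<2)-upper-echelon form: its only leading entry with i + j < 2 is (0, 0),
# and the entry (1, 0) below it is zero.
M = [[1, 2, 3],
     [0, 1, 4],
     [5, 6, 7]]
L = make_upper_echelon(M, 3, 3, 2)
LM = matmul(L, M)
```

In this example a single elementary matrix is used, the entry of $LM$ at position $(2,0)$ on the 2-diagonal becomes zero, and the entries of $LM$ with $i + j < 2$ are unchanged.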

We now present the low-rank recovery algorithm, and its analysis. 

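Algorithm 3 Reconstruct a matrix from inner-products $\{\langle M, R\rangle\}_{R \in \mathcal{R}_k,\ 0 \le k \le n+m-2}$

 1: procedure LowRankRecovery($n$, $m$, $\{\langle M, R\rangle\}_{R \in \mathcal{R}_k,\ 0 \le k \le n+m-2}$)
 2:     $L \leftarrow I_n$
 3:     $N \leftarrow 0_{n \times m}$
 4:     $P \leftarrow 0_{n \times m}$
 5:     for $0 \le k \le n+m-2$ do
 6:         $A \leftarrow 0_{n \times m}$
 7:         $A^{(k)} \leftarrow ((L - I_n)N)^{(k)}$
 8:         $S \leftarrow (k - \mathrm{LNE}_R(P^{(<k)})) \cup \mathrm{LNE}_C(P^{(<k)})$
 9:         $P^{(k)} \leftarrow \mathrm{SR}_k(\{\langle M, R\rangle + \langle A, R\rangle\}_{R \in \mathcal{R}_k}, S)$
10:         $N^{(k)} \leftarrow P^{(k)} - A^{(k)}$
11:         $L_k \leftarrow$ MakeUpperEchelon($P$, $n$, $m$, $k$)
12:         $P^{(k)} \leftarrow (L_k P)^{(k)}$    ▷ Update $\mathrm{LNE}(P^{(\le k)})$
13:         $L \leftarrow L_k L$
14:     end for
15: end procedure

Theorem 7.19. Let $m \ge n \ge r \ge 1$. For $0 \le k \le n+m-2$, let $\mathcal{R}_k$ be sets of $n \times m$ matrices
such that

1. For $k' \ne k$, $R^{(k')} = 0$ for $R \in \mathcal{R}_k$.

2. $\{R^{(k)}\}_{R \in \mathcal{R}_k}$ forms a $\min(r, k+1, (n+m)-(k+1))$-advice-sparse-recovery set.

Then $\mathcal{R} = \bigcup_k \mathcal{R}_k$ is an $r$-low-rank-recovery set.

If, for each $k$, the set $\{R^{(k)}\}_{R \in \mathcal{R}_k}$ has a $\min(r, k+1, (n+m)-(k+1))$-advice-sparse-recovery
algorithm $\mathrm{SR}_k$ running in time $t_k$, then Algorithm 3 performs $r$-low-rank-recovery for $\mathcal{R}$ in time
$O\big(rnm + \sum_{k=0}^{n+m-2} (t_k + n|\mathcal{R}_k|)\big)$.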

Proof. We will first show that $\mathcal{R}$ is an $r$-low-rank-recovery set by showing that Algorithm 3 performs
recovery, assuming oracle access to $r$-advice-sparse-recovery oracles $\mathrm{SR}_k$. We will then analyze the
run-time.

Claim 7.20. The following invariants hold at Line 14, at the end of the loop.

1. $N^{(\le k)} = M^{(\le k)}$

2. $P^{(\le k)} = (LM)^{(\le k)}$

3. $P$ is in $(\le k)$-upper-echelon form

4. $L$ is lower-triangular, invertible, main diagonal is all 1's, and $L - I_n$ only has non-zero
entries with columns in $\mathrm{LNE}_R(P^{(<k)})$

Proof. The proof will be by induction.

$k = 0$: The loop begins with $L = I_n$, $N = 0_{n \times m}$, $P = 0_{n \times m}$. It follows that $A = 0_{n \times m}$ in this run of
the loop, and that $S = \emptyset$. Thus, $P^{(0)}$ is set to $\mathrm{SR}_0(\{\langle M, R\rangle\}_{R \in \mathcal{R}_0}, \emptyset)$. As $r \ge 1$, we get that $\mathcal{R}_0^{(0)}$
is a 1-advice-sparse-recovery set and as $M^{(0)}$ has at most 1 element, it follows that $\mathrm{SR}_0$ recovers
it correctly and thus $P^{(0)} = M^{(0)}$ after Line 9. As $A = 0_{n \times m}$, it follows that $N^{(\le 0)} = M^{(\le 0)}$ also,
satisfying Invariant 1.

Now observe that the procedure MakeUpperEchelon, when run on $k = 0$, will always return
$I_n$. Thus, $L_k$ and $L$ are both $I_n$ at the end of the loop, satisfying Invariant 4. Invariant 3 is
vacuously true as any matrix is in $(<1)$-upper-echelon form. Finally, using that $L = L_k = I_n$, we see
that $P$ is unchanged after Line 9 and so $P^{(\le 0)} = (LM)^{(\le 0)}$, satisfying Invariant 2.

$k > 0$: Using that the invariants held at $k - 1$, we now establish them at $k$. As $P^{(<k)} = (LM)^{(<k)}$
and $P$ is in $(<k)$-upper-echelon form, it follows that $LM$ is in $(<k)$-upper-echelon form. By
Lemma 7.15, it follows that $(LM)^{(k)}$ has at most $r - s$ non-zero entries with columns outside of
$S = (k - \mathrm{LNE}_R((LM)^{(<k)})) \cup \mathrm{LNE}_C((LM)^{(<k)})$, where $s = |\mathrm{LNE}((LM)^{(<k)})|$ and $|S| \le 2s$.
However, using again that $P^{(<k)} = (LM)^{(<k)}$ it follows that $(LM)^{(k)}$ has at most $r - |S|/2$
non-zero entries with columns outside of $S$, where $S$ is as constructed in Line 8. As $(LM)^{(k)}$ has at most
$\min(k+1, (n+m)-(k+1), n)$ non-zero entries in total, and $\mathcal{R}_k$ is a $\min(r, k+1, (n+m)-(k+1))$-advice-sparse-recovery
set, it follows (as $r \le n$) that $\mathrm{SR}_k(\{\langle LM, R\rangle\}_{R \in \mathcal{R}_k}, S)$ successfully recovers
$(LM)^{(k)}$. That is, if $r \ne \min(r, k+1, (n+m)-(k+1))$ then we have enough measurements to
fully recover $(LM)^{(k)}$ regardless of its sparsity and the value of $S$ (and the oracle will perform this
recovery), and if $r = \min(r, k+1, (n+m)-(k+1))$ then we use the advice-sparse-recovery oracle.

We now use the following claim to show how the $\{\langle LM, R\rangle\}$ can be computed.

Claim 7.21. At the beginning of the loop in Line 5, $(LM)^{(k)} = M^{(k)} + ((L - I_n)N)^{(k)}$.

Proof. As $LM = M + (L - I_n)M$, it is enough to show that $((L - I_n)M)^{(k)} = ((L - I_n)N)^{(k)}$.
By induction on the above invariants, $L$ is lower-triangular, with all 1's along the main diagonal,
and $N^{(<k)} = M^{(<k)}$. Thus, $(L - I_n)_{i,\ell} = 0$ for $i \le \ell$, and $M_{\ell,j} = N_{\ell,j}$ for $\ell < k - j$. For any $j \le k$,

$$((L - I_n)M)_{k-j,j} = \sum_{\ell} (L - I_n)_{k-j,\ell} M_{\ell,j} = \sum_{\ell < k-j} (L - I_n)_{k-j,\ell} M_{\ell,j} = \sum_{\ell < k-j} (L - I_n)_{k-j,\ell} N_{\ell,j}$$

$$= \sum_{\ell} (L - I_n)_{k-j,\ell} N_{\ell,j} = ((L - I_n)N)_{k-j,j}.$$

Thus $((L - I_n)M)^{(k)} = ((L - I_n)N)^{(k)}$, giving the claim. □
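
The identity in Claim 7.21 can be checked numerically on a small example. The Python sketch below assumes 0-indexed entries and the $k$-diagonal convention $i + j = k$. The matrices and the helper names are illustrative choices only.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def diagonal(X, k):
    """Entries of the k-diagonal {(i, k - i)} of X, listed by increasing row i."""
    return [X[i][k - i] for i in range(len(X)) if 0 <= k - i < len(X[0])]

k = 2
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
# N agrees with M on the diagonals before k and is zero elsewhere,
# mimicking the partially reconstructed matrix at the start of iteration k.
N = [[M[i][j] if i + j < k else 0 for j in range(3)] for i in range(3)]
L = [[1, 0, 0],
     [2, 1, 0],
     [3, 4, 1]]          # unit lower-triangular
L_minus_I = [[L[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

correction_from_M = diagonal(matmul(L_minus_I, M), k)
correction_from_N = diagonal(matmul(L_minus_I, N), k)
lm_diagonal = diagonal(matmul(L, M), k)
```

Both corrections coincide, so the $k$-diagonal of $LM$ is the $k$-diagonal of $M$ plus a term that depends only on already-recovered entries.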

The above claim shows that at Line 9 we have that $\langle LM, R\rangle = \langle M, R\rangle + \langle A, R\rangle$, for all $R \in \mathcal{R}_k$,
using that $R^{(k')} = 0$ for $k' \ne k$. This shows that Line 9 correctly implements advice-sparse-recovery
of $(LM)^{(k)}$, and thus sets $P^{(k)}$ to this value. It follows that at the end of this line
$P^{(k)} = (LM)^{(k)}$.

Invariant 1: Using the identity proved in the above claim, and the just proven fact that $P^{(k)} =
(LM)^{(k)}$ at the end of Line 9, it follows that at the end of Line 10 we have $N^{(k)} = M^{(k)}$, and thus
$N^{(\le k)} = M^{(\le k)}$. As $N$ is not changed further, this establishes Invariant 1.

Invariant 3: We now examine Lines 11-13. As $P$ has only changed in its $k$-diagonal, it is still in
$(<k)$-upper-echelon form. Thus, Line 11 returns $L_k$ such that $L_k P$ is in $(\le k)$-upper-echelon form,
by Claim 7.18. Further, $(L_k P)^{(\le k)}$ only differs from $P^{(\le k)}$ along the $k$-diagonal, so it follows that
after the update in Line 12, $P$ is in $(\le k)$-upper-echelon form. As $P$ is not further modified,
this establishes Invariant 3.

Invariant 2: Further, as we take $L \leftarrow L_k L$ in Line 13 and previously had that $P^{(\le k)} = (LM)^{(\le k)}$,
it follows that at the end of Line 13 we have that $P^{(\le k)} = (LM)^{(\le k)}$ still, as both $P$ and $LM$ have
been multiplied by $L_k$. This establishes Invariant 2.

Invariant 4: In Line 11, Claim 7.18 shows that $L_k$ is a lower-triangular and invertible matrix,
with main diagonal entries all 1's, and $L_k - I_n$ only has non-zero entries in columns $\mathrm{LNE}_R(P^{(<k)})$.
As $P^{(<k)}$ is not modified further, this remains true at the end of the loop at Line 14. By induction,
at Line 5 we have that $L$ is lower-triangular, invertible, with main diagonal entries all 1's, and
$L - I_n$ only has non-zero entries in columns $\mathrm{LNE}_R(P^{(<(k-1))})$. As $P^{(<(k-1))}$ remains unchanged
throughout this iteration of the loop, this is also true at the beginning of Line 13. By Lemma 7.17,
it follows that after Line 13 $L$ still has the properties of being lower-triangular, invertible, main
diagonal entries being 1's, and $L - I_n$ only has non-zero entries in $\mathrm{LNE}_R(P^{(<k)})$. This establishes
Invariant 4.

Thus, each of the invariants is established for this value of $k$ given that they hold for $k - 1$,
so the invariants hold for all $k$ by induction. □

The above claim shows that at the end of the algorithm, $N^{(\le k)} = M^{(\le k)}$ for $k = n + m - 2$.
But this implies $N = M$, and thus $M$ is reconstructed successfully.

Run-time Analysis: We now bound the run-time of Algorithm 3. The steps outside the for-loop
take $O(nm)$, so it suffices to bound each step of the loop. We will show that each step of the loop
takes $O(rn + t_k + n|\mathcal{R}_k|)$ steps. As there are at most $n + m$ iterations of the loop, the quoted bound
follows.

We begin by noting that the algorithm will not recompute $\mathrm{LNE}(P^{(<k)})$ at each stage. Instead,
this will be maintained throughout the algorithm. As each row of $P$ can have at most one leading
non-zero entry, this is easily stored and indexed. Further, as $P^{(<k)} = (LM)^{(<k)}$ and the rank bound
on $M$ shows, via Lemma 7.14, that $|\mathrm{LNE}((LM)^{(<k)})| \le r$, it follows that if the set $\mathrm{LNE}(P^{(<k)})$ is
maintained as a linked list, then traversing it entirely takes $O(r)$ time.

Note that we do not need to modify $\mathrm{LNE}(P^{(<k)})$ when running MakeUpperEchelon, and
can defer modification to after Line 12. At that point $P^{(\le k)}$ has been determined, and can be used
to compute $\mathrm{LNE}(P^{(<(k+1))}) = \mathrm{LNE}(P^{(\le k)})$ in $O(n)$ time. Thus, $\mathrm{LNE}(P^{(<k)})$ can be maintained
within the quoted time bounds, and accessed as an $O(r)$-sized linked list.

We now analyze the lines of the loop. As written, Line 6 takes $\Theta(nm)$ time, which is above the
quoted run-time bounds. However, one can observe that $A$ is only ever accessed at the values $A^{(k)}$,
when noting that $R \in \mathcal{R}_k$ is only non-zero on its $k$-diagonal. Thus, Line 6 is actually superfluous
and can be omitted.

Line 7 takes $O(rn)$ steps. For, the above invariants show that $L - I_n$ only has non-zero entries
in the columns $\mathrm{LNE}_R(P^{(<k)})$, and as discussed above this set has at most $r$ elements. Thus, each
of the ($\le n$) elements of $A^{(k)}$ is the sum of $\le r$ elements of $N$. Thus $A^{(k)}$ can be computed in $O(rn)$
steps.

Line 8 takes $O(r)$ steps, as $\mathrm{LNE}_R(P^{(<k)})$ is pre-computed.

Line 9 takes $O(t_k + n|\mathcal{R}_k|)$ steps. For, each inner product $\langle A, R\rangle$ takes $O(n)$ steps (as each
matrix is only non-zero on the $k$-diagonal, which has at most $n$ entries), and there are $|\mathcal{R}_k|$ such
inner-products. Running $\mathrm{SR}_k$ takes $t_k$ steps, by definition.

Line 10 takes $O(n)$ steps, as the $k$-diagonal has at most this many entries.

Line 11 takes $O(rn)$ steps by Claim 7.18.

Line 12 takes $O(rn)$ steps, for as used above, $L_k - I_n$ has only non-zero entries with columns
in $\mathrm{LNE}_R(P^{(<k)})$, so each entry in $(L_k P)^{(k)}$ is the sum of at most $r + 1$ products of entries in $L_k$
and $P$, and these products are determined by $\mathrm{LNE}_R(P^{(<k)})$. As there are at most $n$ such entries,
the bound follows.

Line 13 takes $O(rn)$ steps. This is because $L_k$, by Claim 7.18, is the product of $\le r$ elementary
matrices, and left-multiplication by an elementary matrix takes $O(n)$ steps. As MakeUpperEchelon
computes $L_k$ as a product of elementary matrices, the computation of $L_k L$ can also use this
decomposition and thus is computed in $O(rn)$ steps.

Thus, the entire loop runs in $O(rn + t_k + n|\mathcal{R}_k|)$ steps, and there are at most $n + m$ iterations
of the loop, giving the bound. □

We now apply this reduction to our hitting set $\mathcal{D}_{2r,n,m}$, which embeds the sparse-recovery
measurements corresponding to the dual Reed-Solomon code.

Corollary 7.22. Let $1 \le r \le n/2$, $m \ge n \ge 1$. Then $\mathcal{D}_{2r,n,m}$ (from Construction 5.7) has

1. $|\mathcal{D}_{2r,n,m}| = 2(n + m - 2r)r$

2. Each matrix in $\mathcal{D}_{2r,n,m}$ is $n$-sparse.

3. $\mathcal{D}_{2r,n,m}$ is an $r$-low-rank-recovery set

4. Algorithm 3, combined with Algorithm 1, performs low-rank-recovery for $\mathcal{D}_{2r,n,m}$ in time
$O(rnm + (n + m)r^3)$

Proof. (1): This is by construction.

(2): Each matrix in $\mathcal{D}_{2r,n,m}$ has its support contained in some $k$-diagonal, and each $k$-diagonal
has at most $n$ elements.

(3): We will first show that the measurements that $\mathcal{D}_{2r,n,m}$ performs on each $k$-diagonal comprise
a $\min(2r, k+1, (n+m)-(k+1))$-advice-sparse-recovery set.

First consider the case when $k + 1 < 2r \le n$. Then $\min(2r, k+1, (n+m)-(k+1)) = k + 1$, and
$\mathcal{D}_{2r,n,m}$ places $k + 1$ constraints on this $k$-diagonal $M^{(k)}$, which has $k + 1$ entries. The constraint
matrix $V$ is of size $(k+1) \times (k+1)$ with $V_{i,j} = g^{ij}$. As $g$ has order $\ge n$, the elements $1, g, \ldots, g^k$
are distinct. So these constraints form an invertible Vandermonde system and so $M^{(k)}$ (regardless
of the rank of $M$) can be completely recovered from these measurements. In particular, $V$ forms a
$(k+1)$-advice-sparse-recovery set. As the Vandermonde system can be inverted in $O(k^3) = O(r^3)$
time, we see that $(k+1)$-advice-sparse-recovery can be performed in this time.

Similarly, now consider the case when $(n+m)-(k+1) < 2r \le n$ (so it follows that $m \le k$).
Then $\min(2r, k+1, (n+m)-(k+1)) = (n+m)-(k+1)$, and $\mathcal{D}_{2r,n,m}$ places $(n+m)-(k+1)$
constraints on this $k$-diagonal $M^{(k)}$, which has $(n+m)-(k+1)$ entries. The constraint matrix $V$
is of size $((n+m)-(k+1)) \times ((n+m)-(k+1))$ with $V_{i,j} = g^{i(k-(m-1)+j)}$. As $g$ has order $\ge n$,
the elements $g^{k-(m-1)}, \ldots, g^{n-1}$ are distinct. So these constraints form an invertible
Vandermonde system and so $M^{(k)}$ (regardless of the rank of $M$) can be completely recovered from
these measurements. In particular, $V$ forms a $((n+m)-(k+1))$-advice-sparse-recovery set. As
the Vandermonde system can be inverted in $O(((n+m)-(k+1))^3) = O(r^3)$ time, we see that
$((n+m)-(k+1))$-advice-sparse-recovery can be performed in this time.

Now consider the general case when $2r \le k+1, (n+m)-(k+1)$. Then $\min(2r, k+1, (n+m)-(k+1)) = 2r$,
and $\mathcal{D}_{2r,n,m}$ places $2r$ constraints on this $k$-diagonal $M^{(k)}$, which has
$\min(k+1, n, (n+m)-(k+1))$ entries. The constraint matrix $V$ is of size $2r \times \min(k+1, n, (n+m)-(k+1))$
with $V_{i,j} = g^{i(\max(0,k-(m-1))+j)}$. As $g$ has order $\ge n$, the elements

$$g^{\max(0,k-(m-1))},\ g^{\max(0,k-(m-1))+1},\ \ldots,\ g^{\max(0,k-(m-1))+\min(k+1,n,(n+m)-(k+1))-1}$$

are distinct. Thus, it follows from Theorem 7.7 that $V$ is an $r$-advice-sparse-recovery set, and that
recovery can be done in $O(r^3 + n)$ steps.

Thus, by Theorem 7.19, it follows that $\mathcal{D}_{2r,n,m}$ is an $r$-low-rank-recovery set.

(4): By the analysis done for (3), we see that Theorem 7.19 shows that Algorithm 3 (along with
the $r$-advice-sparse-recovery performed by Algorithm 1) yields an $O(rnm + (n+m)r^3)$-time recovery
algorithm for $\mathcal{D}_{2r,n,m}$. □
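
The size formula in item (1) can be sanity-checked by counting constraints diagonal by diagonal, as the proof of item (3) describes. The Python sketch below assumes, following that proof, that the $k$-diagonal receives $\min(2r, k+1, (n+m)-(k+1))$ measurements for $0 \le k \le n+m-2$. The function name is hypothetical.

```python
def measurements_per_diagonal(n: int, m: int, r: int) -> list:
    """Number of measurements placed on each k-diagonal, k = 0, ..., n + m - 2,
    assuming 1 <= r <= n/2 and m >= n."""
    return [min(2 * r, k + 1, (n + m) - (k + 1)) for k in range(n + m - 1)]

# Worked example: n = 4, m = 5, r = 2.
counts = measurements_per_diagonal(4, 5, 2)
total = sum(counts)

# Compare against the closed form 2(n + m - 2r)r over a small parameter range.
mismatches = [
    (n, m, r)
    for n in range(2, 9)
    for m in range(n, 12)
    for r in range(1, n // 2 + 1)
    if sum(measurements_per_diagonal(n, m, r)) != 2 * (n + m - 2 * r) * r
]
```

For $n = 4$, $m = 5$, $r = 2$ the short diagonals at both corners are fully determined (1, 2 and 3 measurements), while the long diagonals each receive $2r = 4$ measurements.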

Remark 7.23. We briefly note that for $r \ge n/2$ we have that $|\mathcal{D}_{2r,n,m}| = nm$ (one cannot use the
formula "$|\mathcal{D}_{2r,n,m}| = 2(n+m-2r)r$" here, but the bound $|\mathcal{D}_{2r,n,m}| \le |\mathcal{B}_{2r,n,m}| = 2(n+m-1)r$ is still
valid). Thus, for $r \ge n/2$ there is no gain from using $\mathcal{D}_{2r,n,m}$ over the obvious $nm$ low-rank-recovery
set that queries each entry in the matrix.

Remark 7.24. One can also use Algorithm 3 to reprove Theorem 5.8, that is, to reprove that $\mathcal{D}_{r,n,m}$
is a hitting set (note that we use $r$ and not $2r$ here). To do so, note that Lemma 7.15 shows that
for a rank $\le r$ matrix $M$, if $M^{(<k)} = 0$ then $M^{(k)}$ is $r$-sparse.

Thus, if $\langle M, \mathcal{D}_{r,n,m}\rangle = 0$ then this implies that for each $k$, $\langle M^{(k)}, \mathcal{R}_k\rangle = 0$, where $\mathcal{R}_k$ is the
$r$-sparse-recovery set formed from the dual Reed-Solomon code. So if $M^{(k)}$ is $r$-sparse then by the
properties of $\mathcal{R}_k$ it must be that $M^{(k)} = 0$.

Combining the two observations above, we see that $M^{(<k)} = 0 \implies M^{(k)} = 0$, and thus
$M^{(<k)} = 0 \implies M^{(\le k)} = 0$. Inducting on $k$ shows that $M = 0_{n \times m}$. Thus, if $M \ne 0$ and $M$ is rank
$\le r$ then $\langle M, \mathcal{D}_{r,n,m}\rangle \ne 0$, showing that $\mathcal{D}_{r,n,m}$ is a hitting set.

Given that $\mathcal{D}_{2r,n,m}$ admits efficient low-rank-recovery, we can recall the above results that show
that these measurements are equivalent to the $\mathcal{B}'_{2r,n,m}$ measurements. Thus, we also get that this
second set admits efficient low-rank-recovery.

Corollary 7.25. Let $1 \le r \le n/2$, $m \ge n \ge 1$. Then $\mathcal{B}'_{2r,n,m}$ (from Construction 5.9) has

1. $|\mathcal{B}'_{2r,n,m}| = 2(n + m - 2r)r$

2. Each matrix in $\mathcal{B}'_{2r,n,m}$ is rank 1.

3. $\mathcal{B}'_{2r,n,m}$ is an $r$-low-rank-recovery set

4. Algorithm 3, combined with Algorithm 1, performs low-rank-recovery for $\mathcal{B}'_{2r,n,m}$ in time
$O(rm^2 + mr^3)$

Proof. (1): This is by construction.

(2): This is also by construction.

(3): By Theorem 5.10 and Theorem 5.8 we have that $\mathrm{Span}\,\mathcal{D}_{2r,n,m} = \mathrm{Span}\,\mathcal{B}'_{2r,n,m}$. In particular,
the measurements $\langle M, \mathcal{D}_{2r,n,m}\rangle$ can be reconstructed from the measurements $\langle M, \mathcal{B}'_{2r,n,m}\rangle$.
As the above corollary shows that $\mathcal{D}_{2r,n,m}$ is an $r$-low-rank-recovery set, it follows that $\mathcal{B}'_{2r,n,m}$ is also.

(4): The analysis given in Theorem 5.10 gives an algorithm for reconstructing the measurements
$\langle M, \mathcal{D}_{2r,n,m}\rangle$ from the measurements $\langle M, \mathcal{B}'_{2r,n,m}\rangle$, and does so by interpolating $r$ polynomials of
degree $\le n + m$. As evaluations of these polynomials take $O(r)$ steps, and polynomial interpolation
takes $O(m^2)$ steps for polynomials of this degree, we see that we can complete this interpolation in
$O(rm^2 + r^2 m) = O(rm^2)$ steps. Once the measurements $\langle M, \mathcal{D}_{2r,n,m}\rangle$ are computed, we can appeal
to the above corollary. □

The above results only work over fields when we have an element g of large order. However, 
the results of Subsection 6.3 show that we can simulate these results over small fields. Indeed, this 
is also the case here. 

Corollary 7.26. Let $m \ge n \ge r \ge 1$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}(m)$-explicit $r$-low-rank-recovery
set for $n \times m$ matrices, which has size $O(rm \lg m)$ and is such that each recovery matrix
is $O(n)$-sparse. There is also a $\mathrm{poly}(m)$-explicit $r$-low-rank-recovery set for $n \times m$ matrices, which
has size $O(rm \lg^2 m)$ and is such that each recovery matrix is rank 1. Further, recovery from either
of these low-rank-recovery sets can be performed in $\mathrm{poly}(m)$ time.

Proof. We begin by noting that both Proposition 6.12 and Proposition 6.16 preserve the property
of being a low-rank-recovery set, not just that of being a hitting set. That is, each of these
propositions takes a $\mathbb{K}$-matrix $H$ in the original low-rank-recovery set and constructs some family
of $\mathbb{F}$-matrices $\{H_{i'}\}_{i'}$ such that for any matrix $M$, $\langle M, H\rangle$ can be efficiently recovered as the
sum $\sum_{i'} a_{i'} \langle M, H_{i'}\rangle$, for some coefficients $a_{i'} \in \mathbb{K}$. Thus the measurements against the original
set over $\mathbb{K}$ are efficiently recoverable from the measurements against the constructed set over $\mathbb{F}$.

Finally, appealing to the constructions of low-rank-recovery sets as given in Corollary 7.22
(to which Proposition 6.12 is applied) and Corollary 7.25 (to which Proposition 6.16 is applied)
completes the claim. □



8 Rank-Metric Tensor codes 

We now discuss low-rank-recovery of tensors, for any d, and apply our results to the construction 
of rank-metric codes. We begin with showing that the matrix low-rank-recovery algorithm can be 
extended to the d > 2 case. 

Theorem 8.1. Let $n, r \ge 1$ and $d \ge 2$. Then $\mathcal{B}_{d,n,2r}$, as defined in Construction 6.10, has

1. $|\mathcal{B}_{d,n,2r}| \le O(dn(2r)^{\lceil \lg d \rceil})$

2. $\mathcal{B}_{d,n,2r}$ is an $r$-low-rank-recovery set, and recovery can be performed in time
$\mathrm{poly}((2dn)^d, (2r)^{\lceil \lg d \rceil})$

Proof. (1): This is by construction.

(2): The hitting set allows us to interpolate the polynomials stated in the hypothesis of
Theorem 6.6. Once we have the coefficients of this polynomial, we can undo the reductions used in
the proof of Theorem 6.6. That is, that proof uses Lemmas 6.1 and 6.2 to reshape polynomials
by merging their variables. This is clearly efficiently reversible. More crucially, the proof uses the
bivariate variable reduction of Theorem 5.1 for rank $\le r$ matrices, but now taking $2r$ distinct
powers of $g$. However, Corollary 7.22 shows that one can recover $f_M(x,y)$ from the polynomials
$\{f_M(x, g^{\ell} x)\}_{\ell \in [\![2r]\!]}$ in $\mathrm{poly}(\deg_x(f_M), \deg_y(f_M), r)$ steps. As the degrees involved in Theorem 6.6
are only up to $(2dn)^d$, this is within the stated time bounds. Thus, we can also reverse the
bivariate variable reduction steps used in Theorem 6.6. Combining these steps shows that we can fully
recover the entire polynomial $f_T(x_1, \ldots, x_d)$, which gives the tensor $T$. □

We next observe that, just as with Corollary 7.26, we can perform this low-rank-recovery over 
small fields, when incurring a loss. 

Corollary 8.2. Let $n, r \ge 1$ and $d \ge 2$. Over any field $\mathbb{F}$, there is a $\mathrm{poly}((2nd)^d, r^{\lceil \lg d \rceil})$-explicit
$r$-low-rank-recovery set for $[\![n]\!]^d$ tensors, which has size
$O(dn(2r)^{\lceil \lg d \rceil} \cdot (d \lg 2dn)^d)$ and is such that
each recovery tensor is rank 1. Further, there is a $\mathrm{poly}((2nd)^d, r^{\lceil \lg d \rceil})$-explicit $r$-low-rank-recovery
set for $[\![n]\!]^d$ tensors, which has size $O(dn(2r)^{\lceil \lg d \rceil} \cdot d \lg 2dn)$. Further, recovery from either of these
low-rank-recovery sets can be performed in $\mathrm{poly}((2nd)^d, r^{\lceil \lg d \rceil})$ time.

Proof. Like Corollary 7.26, we apply Propositions 6.16 and 6.12 to a low-rank-recovery set, where
here we use the above set from Theorem 8.1. As Propositions 6.16 and 6.12, as well as Theorem 8.1,
are efficiently implementable, so are the resulting low-rank-recovery sets. □

We now apply these results to create error correcting codes over the rank-metric, which we now 
define. We will restrict our attention to linear codes in this work. 

Definition 8.3. A $[[\![n]\!]^d, k, r]_{\mathbb{F}}$ rank-metric code $C$ is a $k$-dimensional subspace of $\mathbb{F}^{[\![n]\!]^d}$ (the
space of $[\![n]\!]^d$ tensors) such that for all $T_1 \ne T_2 \in C$, $\mathrm{rank}(T_1 - T_2) \ge r$. Denote $r$ as the distance
of the code.

An algorithm Dec corrects $e$ errors against $C$ if for any $T \in C$ and $E \in \mathbb{F}^{[\![n]\!]^d}$ with $\mathrm{rank}(E) \le e$
it is such that $\mathrm{Dec}(T + E) = T$.

Thus this is the natural definition for error-correcting codes when we use the rank-metric (notice
that rank-distance is in fact a metric) as the notion of distance. As we are interested in linear codes,
$T_1 - T_2 \in C$ also, so an equivalent definition to the above would say that $r \le \mathrm{rank}(T)$ for all
$0 \ne T \in C$. Just as with the Hamming-metric, if we have a distance $2r + 1$ code $C$ then it is
information theoretically possible to decode up to $r$ errors. The converse is shown below.

Lemma 8.4. Let $C$ be a $[[\![n]\!]^d, k, r']_{\mathbb{F}}$ rank-metric code that can correct up to $r$ errors. Then
$r' \ge 2r + 1$.

Proof. Suppose not, for contradiction. Then there are two tensors $T_1 \ne T_2 \in C$ such that
$\mathrm{rank}(T_2 - T_1) \le 2r$. But then $T_2 - T_1 = S_1 + \cdots + S_{2r}$, where these $S_i$ are all rank-1 (or rank-0) tensors. Then
it follows that $T_1 + S_1 + \cdots + S_r$ is $r$-close to both $T_1$ and $T_2$, which is impossible as the correctness
of the decoding procedure indicates that there should be a unique tensor that $T_1 + S_1 + \cdots + S_r$ is
$r$-close to. □
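
The argument of Lemma 8.4 can be illustrated in the matrix case ($d = 2$) with $r = 1$. The Python sketch below uses a small Gaussian-elimination rank routine over the rationals. The specific matrices are illustrative choices and do not come from any code constructed in this paper.

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in A]
    rows, cols, rk = len(A), len(A[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(rk, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[rk], A[pivot] = A[pivot], A[rk]
        for i in range(rk + 1, rows):
            factor = A[i][c] / A[rk][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[rk])]
        rk += 1
    return rk

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

r = 1
S1 = [[1, 0], [0, 0]]            # rank-1 summand
S2 = [[0, 0], [0, 1]]            # rank-1 summand
T1 = [[0, 0], [0, 0]]
T2 = add(T1, add(S1, S2))        # rank(T2 - T1) = 2 = 2r
W = add(T1, S1)                  # the ambiguous received word T1 + S1
```

The word `W` is within rank distance $r = 1$ of both `T1` and `T2`, so no decoder can correct one error for a code containing both, which is the contradiction used in the proof.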

Corollary 8.5. Let $\mathbb{F}$ be a field, $m \ge n \ge r \ge 1$. Then there are $\mathrm{poly}(m)$-explicit rank-metric codes
with $\mathrm{poly}(m)$-time decoding for up to $r$ errors, with parameters:

1. $[[\![n]\!] \times [\![m]\!], nm - 2(n + m - 2r)r, 2r + 1]_{\mathbb{F}}$, if $|\mathbb{F}| > m$, and the parity checks on this code can
be either all rank-1 matrices, or all $O(n)$-sparse matrices.

2. $[[\![n]\!] \times [\![m]\!], nm - 2(n + m - 2r)r \cdot O(\lg m), 2r + 1]_{\mathbb{F}}$, any $\mathbb{F}$, and the parity checks on this code
are all $O(n)$-sparse matrices.

3. $[[\![n]\!] \times [\![m]\!], nm - 2(n + m - 2r)r \cdot O(\lg^2 m), 2r + 1]_{\mathbb{F}}$, any $\mathbb{F}$, and the parity checks on this code
are all rank-1 matrices.

Proof. We first generically show how to define an $[nm, nm - |\mathcal{H}|, 2r + 1]_{\mathbb{F}}$ rank-metric code $C$ from
an $r$-low-rank-recovery set $\mathcal{H}$ and how to use the low-rank-recovery algorithm for $\mathcal{H}$ to decode $C$
up to $r$ errors. The corollary is then immediate by using the results of Corollaries 7.25, 7.22, 7.26,
and invoking the efficiency of their low-rank-recovery.

Define $C$ to be the matrices in the nullspace of $\mathcal{H}$. That is, $C = \{M : \langle M, \mathcal{H}\rangle = 0\}$. It is clear
that $C$ is a subspace, and (assuming that the matrices in $\mathcal{H}$ are linearly independent, which is true
for the low-rank-recovery sets $\mathcal{D}_{2r,n,m}$ and $\mathcal{B}'_{2r,n,m}$) it has dimension $nm - |\mathcal{H}|$.

Now consider some $T \in C$ and matrix $E$ with $\mathrm{rank}(E) \le r$. Abusing notation, consider $T$ and
$E$ as $nm$-long vectors, and $\mathcal{H}$ as a $|\mathcal{H}| \times nm$ matrix. It follows that $\mathcal{H}(T + E) = \mathcal{H}E$ as $T \in C$. As
$\mathcal{H}$ is an $r$-low-rank-recovery set, it follows that we can recover $E$ from $\mathcal{H}E$, and thus can recover $T$,
performing successful decoding of up to $r$ errors. By Lemma 8.4 we see that the minimum distance
of this code is $\ge 2r + 1$. □
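
The syndrome identity $\mathcal{H}(T + E) = \mathcal{H}E$ and the parameters of item 1 can be sketched in Python as follows. The two parity-check matrices below are toy placeholders chosen by hand, not the sets $\mathcal{D}_{2r,n,m}$ or $\mathcal{B}'_{2r,n,m}$, and the low-rank-recovery step that would turn the syndrome back into $E$ is not implemented here. All function names are hypothetical.

```python
def inner(A, B):
    """Entrywise inner product <A, B> of two matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def syndrome(parity_checks, X):
    """The vector of measurements <X, H> for H in the parity-check set."""
    return [inner(X, H) for H in parity_checks]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def code_parameters(n: int, m: int, r: int):
    """(length, dimension, distance) from item 1, assuming 1 <= r <= n/2 <= m/2."""
    return (n * m, n * m - 2 * (n + m - 2 * r) * r, 2 * r + 1)

# Toy parity checks (assumption: for illustration only).
H = [[[1, 0], [0, 0]],
     [[0, 1], [1, 0]]]
T = [[0, 1], [-1, 5]]     # a codeword: both inner products with H vanish
E = [[2, 0], [0, 0]]      # a rank-1 error
received = add(T, E)
```

Because the syndrome map is linear and $T$ lies in its nullspace, the syndrome of the received word depends only on the error, which is why a low-rank-recovery algorithm for the parity-check set doubles as a decoder.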

We now separately state the result for tensors, which is proved exactly as the above corollary, 
but using the relevant low-rank-recovery results for tensors. 

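Corollary 8.6. Let $\mathbb{F}$ be a field, $n, r \ge 1$ and $d \ge 2$. Then there are $\mathrm{poly}((2nd)^d, (2r)^{\lceil \lg d \rceil})$-explicit
rank-metric codes with $\mathrm{poly}((2nd)^d, (2r)^{\lceil \lg d \rceil})$-time decoding for up to $r$ errors, with parameters:

1. $[[\![n]\!]^d, n^d - dn(2r)^{\lceil \lg d \rceil}, 2r + 1]_{\mathbb{F}}$, if $|\mathbb{F}| > (2nd)^d$, and the parity checks on this code are all
rank-1 tensors,

2. $[[\![n]\!]^d, n^d - dnr^{\lceil \lg d \rceil} \cdot O(d \lg(2dn)), 2r + 1]_{\mathbb{F}}$, any $\mathbb{F}$,

3. $[[\![n]\!]^d, n^d - dnr^{\lceil \lg d \rceil} \cdot O((d \lg(2dn))^d), 2r + 1]_{\mathbb{F}}$, any $\mathbb{F}$, and the parity checks on this code are
all rank-1 tensors.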

9 Discussion 

We briefly discuss some directions for further research. 

Reducing Noisy Low-Rank Recovery to Noisy Sparse Recovery We showed in Theo- 
rem 7.19 that low-rank-recovery of matrices can be done using any sparse-recovery oracle. This 
reduction was for non-adaptive measurements, and was done in the presence of no noise. As much 
of the compressed sensing community is interested in the noisy case (so $M$ is only close to rank
$\le r$), the main open question of this work is whether the reduction extends to the noisy case.

Smaller Hitting Sets While the observations of Roth [Rot91] show that our hitting set for
matrices is optimal over algebraically closed fields, our results (Corollary 6.18) over tensors with
$d > 2$ are much larger than the existential bounds of Lemma 3.13. Can these hitting sets be
improved to size $O(\mathrm{poly}(d)\, n r^k)$ for $k = O(1)$? As mentioned in the preliminaries (Lemma 3.14),
any such hitting set with $k < 2$ would yield improved tensor-rank lower bounds (and thus circuit
lower bounds) for odd $d$ such as $d = 3$. However, as the best tensor-rank lower bounds for $d = 3$
are $\Theta(n)$ and our hitting set (over infinite fields) yields this bound (with a smaller constant),
even improving our hitting set for $d = 3$ by constant factors could yield interesting new results.
Specifically, for $d = 3$ can one construct (say over infinite fields) a hitting set of size $\le nr^2/10$ for
$[\![n]\!]^3$ tensors of rank $\le r$?

Better Variable Reduction. Theorem 5.1 shows that a bivariate polynomial with bounded 
individual degrees can be identity tested by identity testing a collection of univariate polynomials, 
where the size of this collection depends on the rank of the bivariate polynomial. This naturally led to 
our hitting sets for matrices. We generalized this to $d$-variate polynomials in Theorem 6.6, but the 
collection of univariate polynomials has a size with a much worse dependence on the tensor rank 
of the $d$-variate polynomial, and is much less explicit. Can the size of the collection be reduced, or 
can the explicitness of this set be made only polynomially larger than its size? We note that, according to 
Lemma 3.14, a more explicit hitting set will yield lower bounds on tensor rank; however, for tensors 
of high degree such lower bounds are known [NW96]. 

Large Field Simulation. The results of Section 6.3 show that hitting sets (and low-rank-recovery sets) that 
involve tensors over an extension field imply hitting sets (and low-rank-recovery sets) over the base 
field. While Proposition 6.16 shows that we can preserve the rank-1 property of these tensors while 
doing so, it introduces an $\exp(d)$ factor in the size of the hitting set. Can this be improved? 

A Cauchy-Binet Formula 



For completeness we give the proof of the Cauchy-Binet formula here. 

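Lemma A.1 (Cauchy-Binet Formula). Let $m \ge n \ge 1$. Let $A \in \mathbb{F}^{n \times m}$, $B \in \mathbb{F}^{m \times n}$. For $S \subseteq [m]$, 
let $A_S$ be the $n \times |S|$ matrix formed from $A$ by taking the columns with indices in $S$. Let $B_S$ be 
defined analogously, but with rows. Then 

$$\det(AB) = \sum_{S \in \binom{[m]}{n}} \det(A_S) \det(B_S).$$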

Proof. Let $C$ be an $m \times m$ diagonal matrix with the variables $x_1, \ldots, x_m$ on the diagonal. Define 
the polynomial $f(x_1, \ldots, x_m) := \det(ACB)$, so that $f(1, \ldots, 1) = \det(AB)$. Every entry of $ACB$ is 
a homogeneous linear function in $x_1, \ldots, x_m$, which implies (as the determinant is homogeneous of 
degree $n$) that $f$ is homogeneous of degree $n$, or zero. Let $S \in \binom{[m]}{n}$ and consider all monomials 
only containing variables in $\{x_i \mid i \in S\}$. Note that we also consider monomials with individual degrees 
above 1. Each monomial of degree $n$ (and thus each monomial with non-zero coefficient in $f$) must 
be associated with some such $S$. 

Define $\rho_S$ to be the vector of variables when the substitution $x_i \mapsto 0$ is performed for $i \notin S$. 
It follows then that $f(\rho_S) = \det(A_S C_S B_S) = \det(A_S) \det(B_S) \cdot \prod_{i \in S} x_i$, where the last equality 
follows as $A_S$, $B_S$ and $C_S$ are all $n \times n$ matrices. By the above reasoning, this implies that the only 
monomials with non-zero coefficients in $f$ are monomials of the form $\prod_{i \in S} x_i$, and such monomials 
have coefficient $\det(A_S) \det(B_S)$. Thus $f = \sum_{S \in \binom{[m]}{n}} \det(A_S) \det(B_S) \prod_{i \in S} x_i$, and so 
$\det(AB) = f(1, \ldots, 1) = \sum_{S \in \binom{[m]}{n}} \det(A_S) \det(B_S)$, yielding the claim. □ 
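
As a sanity check, the identity of Lemma A.1 can be verified numerically on a small instance. The sketch below is illustrative only: the helper names (`det`, `matmul`, `cauchy_binet_sum`) and the example matrices are assumptions made for this check, subsets are indexed from 0 rather than 1, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Determinant via Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def matmul(a, b):
    """Product of an n x m matrix with an m x n matrix (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cauchy_binet_sum(a, b):
    """Sum of det(A_S) * det(B_S) over all n-element subsets S of the m indices."""
    n, m = len(a), len(b)
    total = Fraction(0)
    for s in combinations(range(m), n):
        a_s = [[row[j] for j in s] for row in a]  # columns of A indexed by S
        b_s = [b[j] for j in s]                   # rows of B indexed by S
        total += det(a_s) * det(b_s)
    return total


# Example with n = 2, m = 3 (arbitrary integer entries).
A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
lhs = det(matmul(A, B))
rhs = cauchy_binet_sum(A, B)
```

Here the three subsets contribute 6, 24 and 6, and both sides equal 36.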